Acid alpha-glucosidase and fragments thereof

ABSTRACT

Targeted acid alpha-glucosidase therapeutics that localize to the lysosome are provided. The targeted therapeutics include a therapeutic agent, GAA, and a targeting moiety that binds a receptor on an exterior surface of the cell, permitting proper subcellular localization of the targeted therapeutic upon internalization of the receptor. Nucleic acids, cells, and methods relating to the practice of the invention are also provided.

REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Ser. No. 60/543,812, filed Feb. 10, 2004, the contents of which are incorporated by reference.

BACKGROUND

Acid alpha-glucosidase (GAA) is a lysosomal enzyme that hydrolyzes the alpha 1-4 linkage in maltose and other linear oligosaccharides, including the outer branches of glycogen, thereby breaking down excess glycogen in the lysosome (Hirschhorn et al. (2001) in The Metabolic and Molecular Basis of Inherited Disease, Scriver, et al., eds. (2001), McGraw-Hill: New York, p. 3389-3420). Like other mammalian lysosomal enzymes, GAA is synthesized in the cytosol and traverses the ER where it is glycosylated with N-linked, high mannose type carbohydrate. In the golgi, the high mannose carbohydrate is modified on lysosomal proteins by the addition of mannose-6-phosphate (M6P) which targets these proteins to the lysosome. The M6P-modified proteins are delivered to the lysosome via interaction with either of two M6P receptors. The most favorable form of modification is when two M6Ps are added to a high mannose carbohydrate.

Insufficient GAA activity in the lysosome results in Pompe disease, a disease also known as acid maltase deficiency (AMD), glycogen storage disease type II (GSDII), glycogenosis type II, or GAA deficiency. The diminished enzymatic activity occurs due to a variety of missense and nonsense mutations in the gene encoding GAA. Consequently, glycogen accumulates in the lysosomes of all cells in patients with Pompe disease. In particular, glycogen accumulation is most pronounced in lysosomes of cardiac and skeletal muscle, liver, and other tissues. Accumulated glycogen ultimately impairs muscle function. In the most severe form of Pompe disease, death occurs before two years of age due to cardio-respiratory failure.

Presently, there is no approved treatment available to cure or slow the progress of Pompe disease. Enzyme replacement therapeutics currently in clinical trials require that administered recombinant GAA be taken up by the cells in muscle and liver tissues and be transported to the lysosomes in those cells in an M6P-dependent fashion. However, recombinant GAA produced in engineered CHO cells and in the milk of transgenic rabbits, two sources of enzymes used in recent Pompe enzyme replacement therapy trials, contains extremely little M6P (Van Hove et al. (1996) Proc Natl Acad Sci U S A, 93(1):65-70; and U.S. Pat. No. 6,537,785). Therefore, M6P-dependent delivery of recombinant GAA to lysosomes is not efficient, requiring high dosages and frequent infusions. Accordingly, there remains a need for new, simpler, more efficient, and more cost-effective methods for targeting therapeutic GAA enzymes to patient lysosomes.

SUMMARY OF THE INVENTION

The present invention permits M6P-independent targeting of human GAA or GAA-like enzymes to patient lysosomes by using a peptide tag-based targeting strategy. As a result, the present invention provides efficient delivery of GAA or GAA-like enzymes into target cells.

The invention relates, in part, to the discovery that GAA can be expressed recombinantly using a plurality of open reading frames encoding polypeptides representing different portions of the GAA protein. When provided together, the resulting polypeptides can cooperate to provide the desired enzymatic activity.

Accordingly, the present invention in one aspect relates to a nucleic acid sequence (such as a DNA sequence) encoding an open reading frame of a polypeptide including an amino acid sequence at least 50% identical (e.g. at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% identical) to amino acid residues 70-790 of human GAA or a fragment thereof. The open reading frame does not include an amino acid sequence at least 50% identical (e.g. at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% identical) to amino acid residues 880-952 of human GAA.

In another aspect, the invention relates to a nucleic acid sequence (such as a DNA sequence) encoding an open reading frame of a polypeptide including an amino acid sequence at least 50% identical (e.g. at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% identical) to amino acid residues 880-952 of human GAA or a fragment thereof. The open reading frame does not include an amino acid sequence at least 50% identical (e.g. at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% identical) to amino acid residues 70-790 of human GAA.

The invention also relates to cells containing one or both such nucleic acid sequences.

In one embodiment, a nucleic acid of the invention also encodes a peptide tag fused to the GAA polypeptide. A preferred peptide tag is a ligand for an extracellular receptor. In some embodiments, a peptide tag is a targeting domain that binds an extracellular domain of a receptor on the surface of a target cell and, upon internalization of the receptor, permits localization of the polypeptide in a human lysosome. In one embodiment, the targeting domain includes a urokinase-type plasminogen receptor moiety capable of binding the cation-independent mannose-6-phosphate receptor. In another embodiment, the targeting domain incorporates one or more amino acid sequences of IGF-II (e.g. at least amino acids 48-55; at least amino acids 8-28 and 41-61; or at least amino acids 8-87) or a sequence variant thereof (e.g. R68A) or truncated form thereof (e.g. C-terminally truncated from position 62) that binds the cation-independent mannose-6-phosphate receptor. In one embodiment, a peptide tag is fused directly to the N- or C-terminus of the GAA polypeptide. In another embodiment, a peptide tag is fused to the N- or C-terminus of the GAA polypeptide by a spacer. In one specific embodiment, a peptide tag is fused to the GAA polypeptide by a spacer of 10-25 amino acids. In another specific embodiment, a peptide tag is fused to the GAA polypeptide by a spacer including glycine residues. In another specific embodiment, a peptide tag is fused to the GAA polypeptide by a spacer including a helical structure. In another specific embodiment, a peptide tag is fused to the GAA polypeptide by a spacer at least 50% identical to the sequence GGGTVGDDDDK.

The invention also relates to polypeptides encoded by the nucleic acids of the invention and to pharmaceutical preparations incorporating those polypeptides.

The invention also relates, in part, to an appreciation of particular positions of GAA to which a peptide tag can be fused. Accordingly, in one aspect the invention relates to a targeted therapeutic including a peptide tag fused to amino acid 68, 69, 70, 71, 72, 779, 787, 789, 790, 791, 792, 793, or 796 of human GAA or a portion thereof. The targeted therapeutic can include, for example, amino acid residues 70-952 of human GAA, or a smaller portion, such as amino acid residues 70-790. In one embodiment, a peptide tag is fused to amino acid 70, or to an amino acid within one or two positions of amino acid 70. In some embodiments, the peptide tag is a ligand for an extracellular receptor. For example, some peptide tags are targeting domains that bind an extracellular domain of a receptor on the surface of a target cell and, upon internalization of the receptor, permit localization of the therapeutic agent to a human lysosome. In one embodiment, the targeting domain includes a urokinase-type plasminogen receptor moiety capable of binding the cation-independent mannose-6-phosphate receptor. In another embodiment, the targeting domain incorporates one or more amino acid sequences of IGF-II (e.g. at least amino acids 48-55; at least amino acids 8-28 and 41-61; or at least amino acids 8-87) or a sequence variant thereof (e.g. R68A) or truncated form thereof (e.g. C-terminally truncated from position 62) that binds the cation-independent mannose-6-phosphate receptor. The peptide tag is fused to the GAA polypeptide directly or by a spacer. In one specific embodiment, a peptide tag is fused to the GAA polypeptide by a spacer of 10-25 amino acids. In another specific embodiment, a peptide tag is fused to the GAA polypeptide by a spacer including glycine residues. In another specific embodiment, a peptide tag is fused to the GAA polypeptide by a spacer including a helical structure. In another specific embodiment, a peptide tag is fused to the GAA polypeptide by a spacer at least 50% identical to the sequence GGGTVGDDDDK.
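By way of illustration only, the percent identity thresholds recited above can be checked with a short calculation. The sketch below is written in Python and assumes one simple convention that the text does not itself specify: the two sequences are already aligned to equal length, and identity is the number of matching non-gap positions divided by the alignment length. The function names and the candidate spacer sequences are hypothetical; only the reference spacer GGGTVGDDDDK comes from the text.

```python
# Reference spacer sequence recited in the text.
SPACER = "GGGTVGDDDDK"


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: identity = identical non-gap columns / alignment length.
    Other conventions (e.g. dividing by the shorter sequence) give
    different values and are not covered here.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(candidate: str, reference: str = SPACER,
                    threshold: float = 50.0) -> bool:
    """True if the candidate is at least `threshold` percent identical."""
    return percent_identity(candidate, reference) >= threshold


# Hypothetical candidate differing from the reference at two positions.
candidate = "GGGSAGDDDDK"
print(f"{percent_identity(candidate, SPACER):.1f}% identical")
print("meets 50% threshold:", meets_threshold(candidate))
```

Under this convention, a candidate that differs from the 11-residue reference at two positions is 9/11 identical and satisfies the 50% threshold.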

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1-1 to 1-14 depict an amino acid sequence alignment of selected members of family 31 of glycoside hydrolases.

FIG. 2 is a schematic depiction of the GAA protein.

FIG. 3 depicts exemplary strategies for creating a peptide-tagged GAA.

FIG. 4 depicts an exemplary uptake experiment using wild-type GAA and SS-GAAΔ1-69.

FIG. 5 depicts an exemplary uptake experiment using SS-GAAΔ1-69 and SS-GILTΔ2-7-GAAΔ1-69.

FIG. 6 depicts an exemplary Western blot analysis of 1-87-IGF-II-tagged GAA proteins: the left panel was probed with an anti-GAA antibody; the right panel was probed with an anti-IGF-II antibody. Lane 1: pCEP-GILT1-87-GAA56-952; lane 2: pCEP-GILT1-87-R68A-GAA56-952-1; lane 3: pCEP-GILT1-87-R68-ΔGS-GAA56-952-1; lane 4: pCEP-GILTΔ2-7-spcr1-GAA70-952-1; lane 5: pCEP-GAA; lane 6: pCEP-GILT-GAA29-952.

FIG. 7 depicts an exemplary Western blot analysis comparing proteolysis of wild-type GAA with GAA-791Asc.

FIG. 8 depicts an exemplary Western analysis of wild-type GAA and GAA constructs with a GILT tag engineered with a downstream Factor X protease site, GAA787GILTXa, GAA779GILTXa, and GAA796GILTXa. It also depicts a GAA C-terminal processing model.

DETAILED DESCRIPTION OF THE INVENTION

This invention provides a means of producing GAA that is more effectively targeted to the lysosomes of mammalian cells, for example, human cardiac and skeletal muscle cells. GAA is a member of family 31 of glycoside hydrolases (FIGS. 1-1 to 1-14). Human GAA is synthesized as a 110 kDal precursor (Wisselaar et al. (1993) J. Biol. Chem. 268(3):2223-31). The mature form of the enzyme is a mixture of monomers of 70 and 76 kDal (Wisselaar et al. (1993) J. Biol. Chem. 268(3):2223-31). The precursor enzyme has seven potential glycosylation sites and four of these are retained in the mature enzyme (Wisselaar et al. (1993) J. Biol. Chem. 268(3):2223-31). The proteolytic cleavage events which produce the mature enzyme occur in late endosomes or in the lysosome (Wisselaar et al. (1993) J. Biol. Chem. 268(3):2223-31).

The C-terminal 160 amino acids are absent from the mature 70 and 76 kDal species. However, certain Pompe alleles resulting in the complete loss of GAA activity map to this region, for example Val949Asp (Becker et al. (1998) J. Hum. Genet. 62:991). The phenotype of this mutant indicates that the C-terminal portion of the protein, although not part of the 70 or 76 kDal species, plays an important role in the function of the protein. It has recently been reported that the C-terminal portion of the protein, although cleaved from the rest of the protein during processing, remains associated with the major species (Moreland et al. (Nov. 1, 2004) J. Biol. Chem., Manuscript 404008200). Accordingly, the C-terminal residues could play a direct role in the catalytic activity of the protein. Alternatively, the C-terminal residues may be involved in promoting proper folding of the N-terminal portions of the protein.

This latter possibility is supported by the behavior of certain alleles of sucrase-isomaltase, a related protein. Family 31 includes the sucrase-isomaltase (SI) protein, which contains two distinct but homologous glycoside hydrolase catalytic domains in tandem on a single polypeptide. Each of these is similar to the entire GAA polypeptide in size and the two domains share 36 and 39% identity with GAA. SI is expressed in intestinal brush border cells and is localized to the apical membrane of these polarized cells with the catalytic domains facing the gut lumen due to an amino-terminal transmembrane domain. Once arriving at the apical membrane, the sucrase domain is cleaved from the amino-proximal isomaltase domain by trypsin while the isomaltase domain remains membrane associated. Recent studies indicate that the sucrase domain is required for proper folding and subsequent transport of the isomaltase domain; sucrase is said to be an intramolecular chaperone required for the folding of the isomaltase domain (Jacob et al. (2002) J. Biol. Chem. 277:32141).

Analysis of the expression of a number of engineered GAA cassettes has enabled the identification of two regions that, although cleaved from the mature polypeptide, are nevertheless required for secretion of functional protein from mammalian cells. FIG. 2 summarizes the organization of GAA as we now understand it. The precursor polypeptide possesses a signal sequence and an adjacent putative trans-membrane domain, a trefoil domain (PFAM PF00088) which is a cysteine-rich domain of about 45 amino acids containing 3 disulfide linkages and thought to be involved in protein-protein or protein carbohydrate interactions (Thim (1989) FEBS Lett. 250:85), the domain defined by the mature 70/76 kDal polypeptide, and the C-terminal domain. A mutation in the trefoil domain of SI has an impact on the apical membrane sorting pattern of SI (Spodsberg (2001) J. Biol. Chem. 276:23506). Data presented in examples 1 and 2 indicate that both the trefoil domain and the C-terminal domain are required for the production of functional GAA. It is possible that the C-terminal domain interacts with the trefoil domain during protein folding perhaps facilitating appropriate disulfide bond formation in the trefoil domain.

In one embodiment of the invention, DNA sequence encoding a peptide tag is fused in frame to the 3′ terminus of a GAA cassette that encodes the entire GAA polypeptide with the exception of the C-terminal domain. This cassette is co-expressed in mammalian cells that also express the C-terminal domain of GAA as a separate polypeptide.

The C-terminal domain then functions in trans in conjunction with the 70/76 kDal species to generate active GAA. The boundary between the catalytic domain and the C-terminal domain appears to be at about amino acid residue 791, based on its presence in a short region of less than 18 amino acids that is absent from most members of the family 31 hydrolases and which contains 4 consecutive proline residues in GAA. Indeed, it has now been reported that the C-terminal domain associated with the mature species begins at amino acid residue 792 (Moreland et al. (Nov. 1, 2004) J. Biol. Chem., Manuscript 404008200).

Co-expression can be achieved by driving expression of both polypeptides from one plasmid construct introduced into mammalian cells to produce a stable cell line. Expression can be driven by two promoters on such a plasmid or by one promoter driving expression of a bicistronic construct in which the two cassettes are separated by an IRES element. Alternatively, cell lines expressing both proteins can be constructed sequentially with separate plasmids employing distinct selectable markers.

The peptide tag used in these fusions can be derived from IGF-II to target the cation-independent M6P receptor (CI-MPR). Alternatively, peptide tags that preferentially bind to receptors on the surface of myotubes can be employed. Such peptides have been described (Samoylova et al. (1999) Muscle and Nerve 22:460; U.S. Pat. No. 6,329,501). Other cell surface receptors, such as the Fc receptor, the LDL receptor, or the transferrin receptor are also appropriate targets and can promote targeting of GAA.

In another embodiment of the invention, the cassette encoding the peptide tag is inserted into the native GAA coding sequence at the junction of the mature 70/76 kDal polypeptide and the C-terminal domain, for example at position 791. This creates a single chimeric polypeptide. Because the peptide tag may be unable to bind to its cognate receptor in this configuration, a protease cleavage site may be inserted just downstream of the peptide tag. Once the protein is produced in correct folded form, the C-terminal domain can be cleaved by protease treatment.
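The arrangement of such a chimeric polypeptide can be sketched at the sequence level. The following Python example is illustrative only and rests on several assumptions not stated in the text: residues are numbered from 1, the insertion is placed immediately after the named residue, and the protease site is IEGR, the commonly used Factor Xa recognition sequence. The function name and all sequences other than IEGR are toy placeholders and do not represent real GAA or tag sequences.

```python
def insert_tag(protein: str, after_residue: int, tag: str,
               cleavage_site: str = "") -> str:
    """Insert a peptide tag, followed by an optional protease site,
    immediately after a 1-based residue position of `protein`."""
    if not 1 <= after_residue <= len(protein):
        raise ValueError("after_residue must lie within the protein")
    upstream = protein[:after_residue]    # residues 1..after_residue
    downstream = protein[after_residue:]  # remaining C-terminal residues
    return upstream + tag + cleavage_site + downstream


# Toy placeholders (NOT real sequences): a 12-residue "mature" portion
# followed by a 5-residue "C-terminal domain".
toy_protein = "A" * 12 + "C" * 5
toy_tag = "TTT"
factor_xa_site = "IEGR"  # assumed protease site for this illustration

chimera = insert_tag(toy_protein, after_residue=12, tag=toy_tag,
                     cleavage_site=factor_xa_site)
print(chimera)
```

In this layout the protease site lies between the tag and the C-terminal domain, so that cleavage at the site would separate the C-terminal domain from the tagged upstream portion.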

It may be desirable to employ a protease cleavage site that is acted upon by a protease normally found in human serum. In this way, the tagged GAA can be introduced into the blood stream in a prodrug form and become activated for uptake by the serum resident protease. This might improve the distribution of the enzyme. As before, the peptide tag could be the GILT tag or a muscle-specific tag.

In another embodiment of the invention, the tag is fused at the N-terminus of GAA in such a way as to retain enzymatic activity (Example 3). In the case of N-terminal fusions, it is possible to affect the level of secretion of the enzyme by substituting a heterologous signal peptide for the native GAA signal peptide.

The GAA signal peptide is not cleaved in the ER thereby causing GAA to be membrane bound in the ER (Tsuji et al. (1987) Biochem. Int. 15(5):945-952). In some cell types, the enzyme can be found bound to the plasma membrane with retention of the membrane topology of the ER presumably due to the failure to cleave the signal peptide (Hirschhorn et al., in The Metabolic and Molecular Basis of Inherited Disease, Valle, ed., 2001, McGraw-Hill: New York, pp. 3389-3420). Sequence analysis suggests the presence of a trans-membrane domain adjacent to the signal peptide, which presumably enables the enzyme to remain membrane attached under certain conditions.

It is possible that membrane association of GAA via its signal peptide is an important contributory factor in correct lysosomal targeting of the enzyme. This could happen in two ways: First, the membrane association could directly steer the protein to the lysosome. Second, the membrane association could increase the residence time of the GAA in the Golgi thereby increasing the level of mannose-6-phosphate added to the protein. This would have the net effect of increasing the proportion of the enzyme that is sorted to the lysosome. In either case, if this membrane association were eliminated, then more of the produced enzyme would be secreted and if the latter model were correct, the secreted enzyme would have less mannose-6-phosphate.

Disruption of the membrane association of GAA can be accomplished by replacing the GAA signal peptide and adjacent sequence with an alternate signal peptide for GAA. In the context of GILT tagging of GAA, the chimeric gene contains the IGF-II tag including its signal peptide fused to the N-terminus of GAA at the native signal peptide cleavage site or at appropriate downstream sites. Such a chimeric fusion will direct the production of a recombinant GAA enzyme that is secreted at high levels and that contains a high affinity ligand for the M6P/IGF-II receptor.

Subcellular Targeting Domains

The present invention permits targeting of a therapeutic agent to a lysosome using a protein, or an analog of a protein, that specifically binds a cellular receptor for that protein. The exterior of the cell surface is topologically equivalent to endosomal, lysosomal, golgi, and endoplasmic reticulum compartments. Thus, endocytosis of a molecule through interaction with an appropriate receptor(s) permits transport of the molecule to any of these compartments without crossing a membrane. Should a genetic deficiency result in a deficit of a particular enzyme activity in any of these compartments, delivery of a therapeutic protein can be achieved by tagging it with a ligand for the appropriate receptor(s).

Multiple pathways directing receptor-bound proteins from the plasma membrane to the golgi and/or endoplasmic reticulum have been characterized. Thus, by using a targeting portion from, for example, SV40, cholera toxin, or the plant toxin ricin, each of which co-opts one or more of these subcellular trafficking pathways, a therapeutic can be targeted to the desired location within the cell. In each case, uptake is initiated by binding of the material to the exterior of the cell. For example, SV40 binds to MHC class I receptors, cholera toxin binds to GM1 ganglioside molecules and ricin binds to glycolipids and glycoproteins with terminal galactose on the surface of cells. Following this initial step the molecules reach the ER by a variety of pathways. For example, SV40 undergoes caveolar endocytosis and reaches the ER in a two step process that bypasses the golgi whereas cholera toxin undergoes caveolar endocytosis but traverses the golgi before reaching the ER.

If a targeting moiety related to cholera toxin or ricin is used, it is important that the toxicity of cholera toxin or ricin be avoided. Both cholera toxin and ricin are heteromeric proteins, and the cell surface binding domain and the catalytic activities responsible for toxicity reside on separate polypeptides. Thus, a targeting moiety can be constructed that includes the receptor-binding polypeptide, but not the polypeptide responsible for toxicity. For example, in the case of ricin, the B subunit possesses the galactose binding activity responsible for internalization of the protein, and can be fused to a therapeutic protein. If the further presence of the A subunit improves subcellular localization, a mutant version (mutein) of the A chain that is properly folded but catalytically inert can be provided with the B subunit-therapeutic agent fusion protein.

Proteins delivered to the golgi can be transported to the endoplasmic reticulum (ER) via the KDEL receptor, which retrieves ER-targeted proteins that have escaped to the golgi. Thus, inclusion of a KDEL motif at the terminus of a targeting domain that directs a therapeutic protein to the golgi permits subsequent localization to the ER. For example, a targeting moiety (e.g. an antibody, or a peptide identified by high-throughput screening such as phage display, yeast two hybrid, chip-based assays, and solution-based assays) that binds the cation-independent M6P receptor both at or about pH 7.4 and at or about pH 5.5 permits targeting of a therapeutic agent to the golgi; further addition of a KDEL motif permits targeting to the ER.

Lysosomal Targeting Moieties

The invention permits targeting of a therapeutic agent to a lysosome. Targeting may occur, for example, through binding of a plasma membrane receptor that later passes through a lysosome. Alternatively, targeting may occur through binding of a plasma membrane receptor that later passes through a late endosome; the therapeutic agent can then travel from the late endosome to a lysosome. A preferred lysosomal targeting mechanism involves binding to the cation-independent M6P receptor.

Cation-independent M6P Receptor

The cation-independent M6P receptor is a 275 kDa single chain transmembrane glycoprotein expressed ubiquitously in mammalian tissues. It is one of two mammalian receptors that bind M6P: the second is referred to as the cation-dependent M6P receptor. The cation-dependent M6P receptor requires divalent cations for M6P binding; the cation-independent M6P receptor does not. These receptors play an important role in the trafficking of lysosomal enzymes through recognition of the M6P moiety on high mannose carbohydrate on lysosomal enzymes. The extracellular domain of the cation-independent M6P receptor contains 15 homologous domains (“repeats”) that bind a diverse group of ligands at discrete locations on the receptor.

The cation-independent M6P receptor contains two binding sites for M6P: one located in repeats 1-3 and the other located in repeats 7-9. The receptor binds monovalent M6P ligands with a dissociation constant in the μM range while binding divalent M6P ligands with a dissociation constant in the nM range, probably due to receptor oligomerization. Uptake of IGF-II by the receptor is enhanced by concomitant binding of multivalent M6P ligands such as lysosomal enzymes to the receptor.

The cation-independent M6P receptor also contains binding sites for at least three distinct ligands that can be used as targeting moieties. The cation-independent M6P receptor binds IGF-II with a dissociation constant of about 14 nM at or about pH 7.4, primarily through interactions with repeat 11. Consistent with its function in targeting IGF-II to the lysosome, the dissociation constant is increased approximately 100-fold at or about pH 5.5 promoting dissociation of IGF-II in acidic late endosomes. The receptor is capable of binding high molecular weight O-glycosylated IGF-II forms.
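The effect of the pH-dependent change in dissociation constant can be illustrated with a simple calculation. The Python sketch below assumes a one-site equilibrium binding model, in which the fraction of receptor occupied equals [L] / ([L] + Kd); this model, the function name, and the chosen free ligand concentration are assumptions made for illustration. The Kd of about 14 nM at pH 7.4 and the approximately 100-fold increase at pH 5.5 are taken from the text.

```python
def fraction_bound(ligand_nM: float, kd_nM: float) -> float:
    """Fraction of receptor occupied under a one-site equilibrium model:
    theta = [L] / ([L] + Kd), with both values in nM."""
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


KD_PH74_NM = 14.0                # about 14 nM at or about pH 7.4 (from the text)
KD_PH55_NM = KD_PH74_NM * 100.0  # approximately 100-fold higher at pH 5.5

ligand_nM = 14.0  # hypothetical free IGF-II concentration

print(f"pH 7.4: {fraction_bound(ligand_nM, KD_PH74_NM):.3f}")
print(f"pH 5.5: {fraction_bound(ligand_nM, KD_PH55_NM):.3f}")
```

Under these assumptions, a ligand concentration equal to the pH 7.4 dissociation constant gives half-maximal occupancy at the cell surface, whereas at pH 5.5 the same concentration gives occupancy of about 1%, consistent with release of the ligand in acidic late endosomes.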

An additional useful ligand for the cation-independent M6P receptor is retinoic acid. Retinoic acid binds to the receptor with a dissociation constant of 2.5 nM. Affinity photolabeling of the cation-independent M6P receptor with retinoic acid does not interfere with IGF-II or M6P binding to the receptor, indicating that retinoic acid binds to a distinct site on the receptor. Binding of retinoic acid to the receptor alters the intracellular distribution of the receptor with a greater accumulation of the receptor in cytoplasmic vesicles and also enhances uptake of M6P modified β-glucuronidase. Retinoic acid has a photoactivatable moiety that can be used to link it to a therapeutic agent without interfering with its ability to bind to the cation-independent M6P receptor.

The cation-independent M6P receptor also binds the urokinase-type plasminogen receptor (uPAR) with a dissociation constant of 9 μM. uPAR is a GPI-anchored receptor on the surface of most cell types where it functions as an adhesion molecule and in the proteolytic activation of plasminogen and TGF-β. Binding of uPAR to the CI-M6P receptor targets it to the lysosome, thereby modulating its activity. Thus, fusing the extracellular domain of uPAR, or a portion thereof competent to bind the cation-independent M6P receptor, to a therapeutic agent permits targeting of the agent to a lysosome.

IGF-II

In a preferred embodiment, the lysosomal targeting portion is a protein, peptide, or other moiety that binds the cation-independent M6P/IGF-II receptor in a mannose-6-phosphate-independent manner. Advantageously, this embodiment mimics the normal biological mechanism for uptake of lysosomal storage disease (LSD) proteins, yet does so in a manner independent of mannose-6-phosphate.

For example, by fusing DNA encoding the mature IGF-II polypeptide to the 3′ end of LSD gene cassettes, fusion proteins are created that can be taken up by a variety of cell types and transported to the lysosome. Alternatively, DNA encoding a precursor IGF-II polypeptide can be fused to the 3′ end of an LSD gene cassette; the precursor includes a carboxyterminal portion that is cleaved in mammalian cells to yield the mature IGF-II polypeptide, but the IGF-II signal peptide is preferably omitted (or moved to the 5′ end of the LSD gene cassette). This method has numerous advantages over methods involving glycosylation including simplicity and cost effectiveness, because once the protein is isolated, no further modifications need be made.

IGF-II is preferably targeted specifically to the M6P receptor. Particularly useful are mutations in the IGF-II polypeptide that result in a protein that binds the M6P receptor with high affinity while no longer binding the other two receptors (the IGF-I receptor and the insulin receptor) with appreciable affinity. IGF-II can also be modified to minimize binding to serum IGF-binding proteins (Baxter (2000) Am. J. Physiol Endocrinol Metab. 278(6):967-76) to avoid sequestration of IGF-II/GILT constructs. A number of studies have localized residues in IGF-I and IGF-II necessary for binding to IGF-binding proteins. Constructs with mutations at these residues can be screened for retention of high affinity binding to the M6P/IGF-II receptor and for reduced affinity for IGF-binding proteins. For example, replacing Phe 26 of IGF-II with Ser is reported to reduce affinity of IGF-II for IGFBP-1 and -6 with no effect on binding to the M6P/IGF-II receptor (Bach et al. (1993) J. Biol. Chem. 268(13):9246-54). Other substitutions, such as Ser for Phe 19 and Lys for Glu 9, can also be advantageous. The analogous mutations, separately or in combination, in a region of IGF-I that is highly conserved with IGF-II result in large decreases in IGFBP binding (Magee et al. (1999) Biochemistry 38(48):15863-70).

An alternate approach is to identify minimal regions of IGF-II that can bind with high affinity to the M6P/IGF-II receptor. The residues that have been implicated in IGF-II binding to the M6P/IGF-II receptor mostly cluster on one face of IGF-II (Terasawa et al. (1994) EMBO J. 13(23):5590-7). Although IGF-II tertiary structure is normally maintained by three intramolecular disulfide bonds, a peptide incorporating the amino acid sequence on the M6P/IGF-II receptor binding surface of IGF-II can be designed to fold properly and have binding activity. Such a minimal binding peptide is a highly preferred targeting portion. Designed peptides based on the region around amino acids 48-55 can be tested for binding to the M6P/IGF-II receptor. Alternatively, a random library of peptides can be screened for the ability to bind the M6P/IGF-II receptor either via a yeast two hybrid assay, or via a phage display type assay.

Blood-brain Barrier

One challenge in therapy for lysosomal storage diseases is that many of these diseases have significant neurological involvement. Therapeutic enzymes administered into the blood stream generally do not cross the blood brain barrier and therefore cannot relieve neurological symptoms associated with the diseases. IGF-II, however, has been reported to promote transport across the blood-brain barrier via transcytosis (Bickel et al. (2001) Adv. Drug Deliv. Rev. 46(1-3): 247-79). Thus, appropriately designed GILT constructs should be capable of crossing the blood brain barrier, affording for the first time a means of treating neurological symptoms associated with lysosomal storage diseases. The constructs can be tested using GUS minus mice as described in Example 12. Further details regarding design, construction and testing of targeted therapeutics that can reach neuronal tissue from blood are disclosed in U.S. Ser. No. 60/329,650, filed Oct. 16, 2001, and in U.S. Ser. No. 10/136,639, filed Apr. 30, 2002.

Structure of IGF-II

NMR structures of IGF-II have been solved by two groups (Terasawa et al. (1994) EMBO J. 13(23):5590-7; Torres et al. (1995) J. Mol. Biol. 248(2):385-401) (see, e.g., Protein Data Bank record 1IGL). The general features of the IGF-II structure are similar to IGF-I and insulin. The A and B domains of IGF-II correspond to the A and B chains of insulin. Secondary structural features include an alpha helix from residues 11-21 of the B region connected by a reverse turn in residues 22-25 to a short beta strand in residues 26-28. Residues 25-27 appear to form a small antiparallel beta sheet; residues 59-61 and residues 26-28 may also participate in intermolecular beta-sheet formation. In the A domain of IGF-II, alpha helices spanning residues 42-49 and 53-59 are arranged in an antiparallel configuration perpendicular to the B-domain helix. Hydrophobic clusters formed by two of the three disulfide bridges and conserved hydrophobic residues stabilize these secondary structure features. The N and C termini remain poorly defined as is the region between residues 31-40.

IGF-II binds to the IGF-II/M6P and IGF-I receptors with relatively high affinity and binds with lower affinity to the insulin receptor. IGF-II also interacts with a number of serum IGFBPs.

Binding to the IGF-II/M6P Receptor

Substitution of IGF-II residues 48-50 (Phe Arg Ser) with the corresponding residues from insulin (Thr Ser Ile), or substitution of residues 54-55 (Ala Leu) with the corresponding residues from IGF-I (Arg Arg), results in diminished binding to the IGF-II/M6P receptor but retention of binding to the IGF-I and insulin receptors (Sakano et al. (1991) J. Biol. Chem. 266(31):20626-35).

IGF-I and IGF-II share identical sequences and structures in the region of residues 48-50 yet have a 1000-fold difference in affinity for the IGF-II receptor. The NMR structure reveals a structural difference between IGF-I and IGF-II in the region of IGF-II residues 53-58 (IGF-I residues 54-59): the alpha-helix is better defined in IGF-II than in IGF-I and, unlike IGF-I, there is no bend in the backbone around residues 53 and 54 (Torres et al. (1995) J. Mol. Biol. 248(2):385-401). This structural difference correlates with the substitution of Ala 54 and Leu 55 in IGF-II with Arg 55 and Arg 56 in IGF-I. It is possible either that binding to the IGF-II receptor is disrupted directly by the presence of charged residues in this region or that changes in the structure engendered by the charged residues yield the changes in binding for the IGF-II receptor. In any case, substitution of uncharged residues for the two Arg residues in IGF-I resulted in higher affinities for the IGF-II receptor (Cacciari et al. (1987) Pediatrician 14(3):146-53). Thus the presence of positively charged residues in these positions correlates with loss of binding to the IGF-II receptor.

IGF-II binds to repeat 11 of the cation-independent M6P receptor. Indeed, a minireceptor in which only repeat 11 is fused to the transmembrane and cytoplasmic domains of the cation-independent M6P receptor is capable of binding IGF-II (with an affinity approximately one tenth the affinity of the full length receptor) and mediating internalization of IGF-II and its delivery to lysosomes (Grimme et al. (2000) J. Biol. Chem. 275(43):33697-33703). The structure of domain 11 of the M6P receptor is known (Protein Data Bank entries 1GP0 and 1GP3; Brown et al. (2002) EMBO J. 21(5):1054-1062). The putative IGF-II binding site is a hydrophobic pocket believed to interact with hydrophobic amino acids of IGF-II; candidate amino acids of IGF-II include leucine 8, phenylalanine 48, alanine 54, and leucine 55. Although repeat 11 is sufficient for IGF-II binding, constructs including larger portions of the cation-independent M6P receptor (e.g. repeats 10-13, or 1-15) generally bind IGF-II with greater affinity and with increased pH dependence (see, for example, Linnell et al. (2001) J. Biol. Chem. 276(26):23986-23991).

Binding to the IGF-I Receptor

Substitution of IGF-II residues Tyr 27 with Leu, Leu 43 with Val or Ser 26 with Phe diminishes the affinity of IGF-II for the IGF-I receptor by 94-, 56-, and 4-fold respectively (Torres et al. (1995) J. Mol. Biol. 248(2):385-401). Deletion of residues 1-7 of human IGF-II resulted in a 30-fold decrease in affinity for the human IGF-I receptor and a concomitant 12-fold increase in affinity for the rat IGF-II receptor (Hashimoto et al. (1995) J. Biol. Chem. 270(30):18013-8). The NMR structure of IGF-II shows that Thr 7 is located near residues 48 Phe and 50 Ser as well as near the 9 Cys-47 Cys disulfide bridge. It is thought that interaction of Thr 7 with these residues can stabilize the flexible N-terminal hexapeptide required for IGF-I receptor binding (Terasawa et al. (1994) EMBO J. 13(23):5590-7). At the same time this interaction can modulate binding to the IGF-II receptor. Truncation of the C-terminus of IGF-II (residues 62-67) also appears to lower the affinity of IGF-II for the IGF-I receptor by 5-fold (Roth et al. (1991) Biochem. Biophys. Res. Commun. 181(2):907-14).

Deletion Mutants of IGF-II

The binding surfaces for the IGF-I and cation-independent M6P receptors are on separate faces of IGF-II. Based on structural and mutational data, functional cation-independent M6P binding domains can be constructed that are substantially smaller than human IGF-II. For example, the amino terminal amino acids 1-7 and/or the carboxy terminal residues 62-67 can be deleted or replaced. Additionally, amino acids 29-40 can likely be eliminated or replaced without altering the folding of the remainder of the polypeptide or binding to the cation-independent M6P receptor. Thus, a targeting moiety including amino acids 8-28 and 41-61 can be constructed. These stretches of amino acids could perhaps be joined directly or separated by a linker. Alternatively, amino acids 8-28 and 41-61 can be provided on separate polypeptide chains. Comparable domains of insulin, which is homologous to IGF-II and has a tertiary structure closely related to the structure of IGF-II, have sufficient structural information to permit proper refolding into the appropriate tertiary structure, even when present in separate polypeptide chains (Wang et al. (1991) Trends Biochem. Sci. 279-281). Thus, for example, amino acids 8-28, or a conservative substitution variant thereof, could be fused to a therapeutic agent; the resulting fusion protein could be admixed with amino acids 41-61, or a conservative substitution variant thereof, and administered to a patient.
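The residue ranges in this section use 1-based, inclusive numbering of mature human IGF-II. As a minimal sketch, the two retained stretches can be extracted from a sequence string as shown below. The 67-residue sequence is an assumption taken from public sequence records rather than from this document and should be verified before use. The helper names `residues` and `truncated_tag` are hypothetical.

```python
# ASSUMPTION: mature human IGF-II (67 residues) as found in public sequence
# records; it is not given in this document and should be verified before use.
IGF2_MATURE = (
    "AYRPSETLCGGELVDTLQFVCGDRGFYFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSE"
)


def residues(seq: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive numbering."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} is outside 1-{len(seq)}")
    return seq[start - 1:end]


def truncated_tag(seq: str = IGF2_MATURE, linker: str = "") -> str:
    """Join residues 8-28 and 41-61 directly or through an optional linker."""
    return residues(seq, 8, 28) + linker + residues(seq, 41, 61)


# The two stretches can also be kept as separate chains.
b_part = residues(IGF2_MATURE, 8, 28)   # drops residues 1-7 and 29-40
a_part = residues(IGF2_MATURE, 41, 61)  # drops residues 62-67

print(b_part, a_part, truncated_tag(linker="GGGGS"))
```

Under the assumed sequence, residues 48-50 read Phe Arg Ser and residues 54-55 read Ala Leu, which agrees with the residue identities cited in the text and offers a quick check on the numbering.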

In order to facilitate proper presentation and folding of the IGF-II tag, longer portions of IGF-II proteins can be used. For example, an IGF-II tag including amino acid residues 1-67, 1-87, or the entire precursor form can be used.

Binding to IGF Binding Proteins

IGF-II and related constructs can be modified to diminish their affinity for IGFBPs, thereby increasing the bioavailability of the tagged proteins.

Substitution of IGF-II residue phenylalanine 26 with serine reduces binding to IGFBPs 1-5 by 5-75 fold (Bach et al. (1993) J. Biol. Chem. 268 (13):9246-54). Replacement of IGF-II residues 48-50 with threonine-serine-isoleucine reduces binding by more than 100 fold to most of the IGFBPs (Bach et al. (1993) J. Biol. Chem. 268 (13):9246-54); these residues are, however, also important for binding to the cation-independent mannose-6-phosphate receptor. The Y27L substitution that disrupts binding to the IGF-I receptor interferes with formation of the ternary complex with IGFBP3 and acid labile subunit (Hashimoto et al. (1997) J. Biol. Chem. 272 (44):27936-42); this ternary complex accounts for most of the IGF-II in the circulation (Yu et al. (1999) J. Clin. Lab Anal. 13 (4):166-72). Deletion of the first six residues of IGF-II also interferes with IGFBP binding (Luthi et al. (1992) Eur. J. Biochem. 205 (2):483-90).

Studies on IGF-I interaction with IGFBPs revealed additionally that substitution of serine for phenylalanine 16 did not affect secondary structure but decreased IGFBP binding by between 40 and 300 fold (Magee et al. (1999) Biochemistry 38(48):15863-70). Changing glutamate 9 to lysine also resulted in a significant decrease in IGFBP binding. Furthermore, the double mutant lysine 9/serine 16 exhibited the lowest affinity for IGFBPs. Although these mutations have not previously been tested in IGF-II, the conservation of sequence between this region of IGF-I and IGF-II suggests that a similar effect will be observed when the analogous mutations are made in IGF-II (glutamate 12 lysine/phenylalanine 19 serine).
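Point substitutions such as glutamate 12 lysine and phenylalanine 19 serine can be written compactly as E12K and F19S. The following sketch applies such substitutions to a sequence string and rejects any substitution whose stated wild-type residue does not match the sequence. The function name `apply_substitutions` is hypothetical, and the mature IGF-II sequence is an assumption taken from public sequence records, to be verified before use.

```python
import re

# ASSUMPTION: mature human IGF-II (67 residues) from public sequence records;
# not given in this document, verify before use.
IGF2_MATURE = (
    "AYRPSETLCGGELVDTLQFVCGDRGFYFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSE"
)


def apply_substitutions(seq: str, substitutions: list[str]) -> str:
    """Apply substitutions written as <wild-type><1-based position><new>."""
    out = list(seq)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        wild, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(seq):
            raise ValueError(f"position {pos} is outside 1-{len(seq)}")
        if seq[pos - 1] != wild:
            raise ValueError(
                f"{sub}: sequence has {seq[pos - 1]} at position {pos}"
            )
        out[pos - 1] = new
    return "".join(out)


# Double mutant analogous to the IGF-I lysine 9/serine 16 variant.
variant = apply_substitutions(IGF2_MATURE, ["E12K", "F19S"])
print(variant)
```

Checking the wild-type residue guards against off-by-one numbering errors, which are easy to make when moving between IGF-I and IGF-II residue numbers.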

IGF-II Homologs

The amino acid sequence of human IGF-II, or a portion thereof affecting binding to the cation-independent M6P receptor, may be used as a reference sequence to determine whether a candidate sequence possesses sufficient amino acid similarity to have a reasonable expectation of success in the methods of the present invention. Preferably, variant sequences are at least 70% similar or 60% identical, more preferably at least 75% similar or 65% identical, and most preferably 80% similar or 70% identical to human IGF-II.

To determine whether a candidate peptide region has the requisite percentage similarity or identity to human IGF-II, the candidate amino acid sequence and human IGF-II are first aligned using the dynamic programming algorithm described in Smith and Waterman (1981) J. Mol. Biol. 147:195-197, in combination with the BLOSUM62 substitution matrix described in FIG. 2 of Henikoff and Henikoff (1992) PNAS 89:10915-10919. For the present invention, an appropriate value for the gap insertion penalty is −12, and an appropriate value for the gap extension penalty is −4. Computer programs performing alignments using the algorithm of Smith-Waterman and the BLOSUM62 matrix, such as the GCG program suite (Oxford Molecular Group, Oxford, England), are commercially available and widely used by those skilled in the art.

Once the alignment between the candidate and reference sequence is made, a percent similarity score may be calculated. The individual amino acids of each sequence are compared sequentially according to their similarity to each other. If the value in the BLOSUM62 matrix corresponding to the two aligned amino acids is zero or a negative number, the pairwise similarity score is zero; otherwise the pairwise similarity score is 1.0. The raw similarity score is the sum of the pairwise similarity scores of the aligned amino acids. The raw score is then normalized by dividing it by the number of amino acids in the smaller of the candidate or reference sequences. The normalized raw score is the percent similarity. Alternatively, to calculate a percent identity, the aligned amino acids of each sequence are again compared sequentially. If the amino acids are non-identical, the pairwise identity score is zero; otherwise the pairwise identity score is 1.0. The raw identity score is the sum of the identical aligned amino acids. The raw score is then normalized by dividing it by the number of amino acids in the smaller of the candidate or reference sequences. The normalized raw score is the percent identity. Insertions and deletions are ignored for the purposes of calculating percent similarity and identity. Accordingly, gap penalties are not used in this calculation, although they are used in the initial alignment.
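The scoring rules above can be sketched as follows, assuming the alignment has already been computed and is supplied as two equal-length strings with "-" marking gaps. The function name `percent_similarity_identity` and its `score` and `shorter_length` parameters are hypothetical. The small set of positive-scoring pairs is only an illustrative excerpt assumed to agree with BLOSUM62, and the full published matrix should be used in practice.

```python
# Illustrative excerpt only: non-identical pairs assumed to score above zero
# in BLOSUM62. Pairs missing from this excerpt are treated as non-positive,
# which is a simplification for the sketch.
POSITIVE_PAIRS = {
    frozenset("IL"), frozenset("IV"), frozenset("LV"), frozenset("KR"),
    frozenset("DE"), frozenset("ST"), frozenset("FY"),
}


def excerpt_sign(a: str, b: str) -> int:
    """Return 1 if the pair is assumed to score positive, else 0."""
    if a == b:
        return 1  # identical residues score positive in BLOSUM62
    return 1 if frozenset((a, b)) in POSITIVE_PAIRS else 0


def percent_similarity_identity(aligned_a, aligned_b, score,
                                shorter_length=None):
    """Return (percent similarity, percent identity) for an alignment.

    Gapped columns are skipped, and both raw scores are divided by the
    length of the shorter sequence. If the aligned strings cover only part
    of the sequences, pass the true shorter length as shorter_length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned strings must have equal length")
    if shorter_length is None:
        shorter_length = min(len(aligned_a.replace("-", "")),
                             len(aligned_b.replace("-", "")))
    if shorter_length <= 0:
        raise ValueError("shorter sequence length must be positive")
    similar = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # insertions and deletions are ignored
        if score(a, b) > 0:
            similar += 1  # pairwise similarity score of 1.0
        if a == b:
            identical += 1  # pairwise identity score of 1.0
    return (100.0 * similar / shorter_length,
            100.0 * identical / shorter_length)


similarity, identity = percent_similarity_identity(
    "LCGGELV-DT", "LCGAEIVKDS", excerpt_sign)
print(round(similarity, 1), round(identity, 1))
```

In this hypothetical alignment the shorter sequence has nine residues, six aligned pairs are identical, and two further pairs (Leu/Ile and Thr/Ser) score positive, so the sketch reports 8/9 similarity and 6/9 identity.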

IGF-II Structural Analogs

The known structures of human IGF-II and the cation-independent M6P receptors permit the design of IGF-II analogs and other cation-independent M6P receptor binding proteins using computer-assisted design principles such as those discussed in U.S. Pat. Nos. 6,226,603 and 6,273,598. For example, the known atomic coordinates of IGF-II can be provided to a computer equipped with a conventional computer modeling program, such as INSIGHTII, DISCOVER, or DELPHI, commercially available from Biosym Technologies Inc., or QUANTA or CHARMM, commercially available from Molecular Simulations, Inc. These and other software programs allow analysis of molecular structures and simulations that predict the effect of molecular changes on structure and on intermolecular interactions. For example, the software can be used to identify modified analogs with the ability to form additional intermolecular hydrogen or ionic bonds, improving the affinity of the analog for the target receptor.

The software also permits the design of peptides and organic molecules with structural and chemical features that mimic the same features displayed on at least part of the surface of the cation-independent M6P receptor binding face of IGF-II. Because a major contribution to the receptor binding surface is the spatial arrangement of chemically interactive moieties present within the sidechains of amino acids which together define the receptor binding surface, a preferred embodiment of the present invention relates to designing and producing a synthetic organic molecule having a framework that carries chemically interactive moieties in a spatial relationship that mimics the spatial relationship of the chemical moieties disposed on the amino acid sidechains which constitute the cation-independent M6P receptor binding face of IGF-II. Preferred chemical moieties include, but are not limited to, the chemical moieties defined by the amino acid side chains of amino acids constituting the cation-independent M6P receptor binding face of IGF-II. It is understood, therefore, that the receptor binding surface of the IGF-II analog need not comprise amino acid residues but only the chemical moieties disposed thereon.

For example, upon identification of relevant chemical groups, the skilled artisan using a conventional computer program can design a small molecule having the receptor interactive chemical moieties disposed upon a suitable carrier framework. Useful computer programs are described in, for example, Dixon (1992) Tibtech 10: 357-363; Tschinke et al. (1993) J. Med. Chem. 36: 3863-3870; and Eisen et al. (1994) Proteins: Structure, Function, and Genetics 19: 199-221, the disclosures of which are incorporated herein by reference.

One particular computer program entitled “CAVEAT” searches a database, for example, the Cambridge Structural Database, for structures which have desired spatial orientations of chemical moieties (Bartlett et al. (1989) in “Molecular Recognition: Chemical and Biological Problems” (Roberts, S. M., ed) pp 182-196). The CAVEAT program has been used to design analogs of tendamistat, a 74 residue inhibitor of α-amylase, based on the orientation of selected amino acid side chains in the three-dimensional structure of tendamistat (Bartlett et al. (1989) supra).

Alternatively, upon identification of a series of analogs which mimic the cation-independent M6P receptor binding activity of IGF-II, the skilled artisan may use a variety of computer programs which assist the skilled artisan to develop quantitative structure activity relationships (QSAR) and further to assist in the de novo design of additional morphogen analogs. Other useful computer programs are described in, for example, Connolly-Martin (1991) Methods in Enzymology 203:587-613; Dixon (1992) supra; and Waszkowycz et al. (1994) J. Med. Chem. 37: 3994-4002.

Fusion Junctions

Where GAA is expressed as a fusion protein with a peptide tag or targeting domain, the peptide tag can be fused directly to the GAA polypeptide or can be separated from the GAA polypeptide by a linker. An amino acid linker incorporates an amino acid sequence other than that appearing at that position in the natural protein and is generally designed to be flexible or to interpose a structure, such as an α-helix, between the two protein moieties. A linker can be relatively short, such as the sequence Gly-Ala-Pro or Gly-Gly-Gly-Gly-Gly-Pro, or can be longer, such as, for example, 10-25 amino acids in length. For example, flexible repeating linkers of 3-4 copies of the sequence (GGGGS) and α-helical repeating linkers of 2-5 copies of the sequence (EAAAK) have been described (Arai et al. (2004) Proteins:Structure, Function and Bioinformatics 57:829-838). The use of another linker, GGGGTVGDDDDK, in the context of an IGF-II fusion protein has also been reported (DiFalco et al. (1997) Biochem. J. 326:407-413). Linkers incorporating an α-helical portion of a human serum protein can be used to minimize immunogenicity of the linker region.
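As a brief illustration of the linker options described above, the sketch below builds the repeating linkers from their unit sequences and joins a tag to a partner polypeptide through an optional linker. The function names `repeat_linker` and `fuse` are hypothetical, the placeholder strings in the usage line are not real protein sequences, and placing the tag first is an arbitrary choice for the example.

```python
# Linker sequences named in the text, in one-letter code.
SHORT_LINKERS = ("GAP", "GGGGGP")  # Gly-Ala-Pro, Gly-Gly-Gly-Gly-Gly-Pro
REPORTED_IGF2_LINKER = "GGGGTVGDDDDK"


def repeat_linker(unit: str, copies: int) -> str:
    """Return a linker made of `copies` tandem repeats of `unit`."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return unit * copies


def fuse(tag: str, partner: str, linker: str = "") -> str:
    """Join a tag to a partner directly or through a linker."""
    return tag + linker + partner


# Flexible linkers: 3-4 copies of GGGGS. Helical linkers: 2-5 copies of EAAAK.
flexible = [repeat_linker("GGGGS", n) for n in (3, 4)]
helical = [repeat_linker("EAAAK", n) for n in range(2, 6)]

# Placeholder strings stand in for the tag and partner sequences.
print(fuse("TAG", "PARTNER", linker=flexible[0]))
```

The repeating linkers built here span 10 to 25 residues, matching the longer linker lengths mentioned in the text.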

The site of a fusion junction should be selected with care to promote proper folding and activity of both fusion partners and to prevent premature separation of a peptide tag from a GAA polypeptide. FIG. 3 illustrates four exemplary strategies for creating a GILT-tagged GAA, based on the model for the organization of GAA protein as illustrated in FIG. 2.

1. Fusion of the tag at the amino terminus.
2. Insertion of the tag between the trefoil domain and the mature region.
3. Insertion of the tag between the mature region and the C-terminal domain.
4. Fusion of the tag to the C-terminus of a truncated GAA and co-expressing the C-terminal domain.

For example, a targeting domain can be fused, directly or by a spacer, to amino acid 70 of GAA, a position permitting expression of the protein, catalytic activity of the GAA moiety, and proper targeting by the targeting moiety as described in Example 4. Alternatively, a targeting domain can be fused at or near the cleavage site separating the C-terminal domain of GAA from the mature polypeptide. This permits synthesis of a GAA protein with an internal targeting domain, which optionally can be cleaved to liberate the mature polypeptide or the C-terminal domain from the targeting domain, depending on placement of cleavage sites. Alternatively, the mature polypeptide can be synthesized as a fusion protein at about position 791 without incorporating C-terminal sequences in the open reading frame of the expression construct.

In order to facilitate folding of the GILT tag, GAA amino acid residues adjacent to the fusion junction can be modified. For example, since it is possible that GAA cysteine residues may interfere with proper folding of the GILT tag, the terminal GAA cysteine 952 can be deleted or substituted with serine to accommodate a C-terminal GILT tag. The GILT tag can also be fused immediately preceding the final Cys952. The penultimate Cys938 can be changed to proline in conjunction with a mutation of the final Cys952 to serine.

Alternatively, a tag can be chemically coupled to a GAA polypeptide.

Targeting Moiety Affinities

Preferred targeting moieties bind to their target receptors with a submicromolar dissociation constant. Generally speaking, lower dissociation constants (e.g. less than 10⁻⁷ M, less than 10⁻⁸ M or less than 10⁻⁹ M) are increasingly preferred. Dissociation constants are preferably determined by surface plasmon resonance as described in Linnell et al. (2001) J. Biol. Chem. 276(26):23986-23991. A soluble form of the extracellular domain of the target receptor (e.g. repeats 1-15 of the cation-independent M6P receptor) is generated and immobilized to a chip through an avidin-biotin interaction. The targeting moiety is passed over the chip, and kinetic and equilibrium constants are detected and calculated by measuring changes in mass associated with the chip surface.
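For a simple 1:1 binding model, the equilibrium dissociation constant follows from the kinetic constants as K_D = k_off / k_on. The sketch below computes K_D and reports the tightest of the thresholds named above that a value falls below. The function names and the rate constants in the usage lines are hypothetical and are not measurements from this document. Real surface plasmon resonance data may need a more elaborate binding model.

```python
def dissociation_constant(k_on: float, k_off: float) -> float:
    """K_D in M from k_on in 1/(M*s) and k_off in 1/s (1:1 binding model)."""
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


def preference_tier(kd: float):
    """Return the tightest threshold (in M) that kd is below, or None.

    Thresholds: submicromolar (1e-6 M), then 1e-7, 1e-8 and 1e-9 M,
    with lower values increasingly preferred.
    """
    for threshold in (1e-9, 1e-8, 1e-7, 1e-6):
        if kd < threshold:
            return threshold
    return None  # not submicromolar


# Hypothetical rate constants, for illustration only.
kd = dissociation_constant(k_on=2e5, k_off=1e-3)
print(kd, preference_tier(kd))
```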

Computation of Sequence Similarity

In order to produce variants of the disclosed sequences that may also serve as a catalytic domain, chaperone domain, or subcellular targeting domain, any one or more of the naturally-occurring alpha-glucosidases or subcellular targeting domains (for example, IGF-II) disclosed herein may be used as a reference sequence to determine whether a candidate sequence possesses sufficient amino acid similarity to have a reasonable expectation of success in the methods of the present invention. For example, variant sequences of a catalytic domain are at least 50% similar or 30% identical, preferably at least 55% similar or 35% identical, more preferably at least 60% similar or 40% identical, more preferably at least 65% similar or 45% identical, more preferably at least 70% similar or 50% identical, more preferably at least 75% similar or 55% identical, more preferably at least 80% similar or 60% identical, more preferably at least 85% similar or 65% identical, more preferably at least 90% similar or 70% identical, more preferably at least 95% similar or 75% identical, and most preferably 80% identical, 85% identical, 90% identical, or 95% identical to one of the disclosed, naturally-occurring catalytic domains of acid alpha-glucosidase.
Variant sequences of a chaperone domain are at least 40% similar or 20% identical, preferably at least 45% similar or 25% identical, more preferably at least 50% similar or 30% identical, more preferably at least 55% similar or 35% identical, more preferably at least 60% similar or 40% identical, more preferably at least 65% similar or 45% identical, more preferably at least 70% similar or 50% identical, more preferably at least 75% similar or 55% identical, more preferably at least 80% similar or 60% identical, more preferably at least 85% similar or 65% identical, more preferably at least 90% similar or 70% identical, more preferably at least 95% similar or 75% identical, and most preferably 80% identical, 85% identical, 90% identical, or 95% identical to one of the disclosed, naturally-occurring chaperone domains of acid alpha-glucosidase. Variant sequences of a targeting domain are at least 70% similar or 60% identical, more preferably at least 75% similar or 65% identical, more preferably 80% similar or 70% identical, more preferably 85% similar or 75% identical, more preferably 90% similar or 80% identical, more preferably 95% similar or 85% identical, and most preferably 90% identical or 95% identical to one of the disclosed, naturally-occurring targeting domains.
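The minimum thresholds stated above for each domain type can be collected in a small lookup, sketched below. Only the broadest tier for each domain is encoded, and the names `MIN_THRESHOLDS` and `meets_minimum` are hypothetical. The percent values passed in are assumed to have been computed by the alignment-based procedure described in this section.

```python
# Broadest tier for each domain type: (minimum % similar, minimum % identical).
MIN_THRESHOLDS = {
    "catalytic": (50.0, 30.0),
    "chaperone": (40.0, 20.0),
    "targeting": (70.0, 60.0),
}


def meets_minimum(domain: str, percent_similar: float,
                  percent_identical: float) -> bool:
    """True if either the similarity or the identity minimum is met."""
    min_similar, min_identical = MIN_THRESHOLDS[domain]
    return (percent_similar >= min_similar
            or percent_identical >= min_identical)


# Hypothetical scores for a candidate targeting domain.
print(meets_minimum("targeting", percent_similar=72.0, percent_identical=55.0))
```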

To determine whether a candidate peptide region has the requisite percentage similarity or identity to a reference polypeptide or peptide oligomer, the candidate amino acid sequence and the reference amino acid sequence are first aligned using the dynamic programming algorithm described in Smith and Waterman (1981), J. Mol. Biol. 147:195-197, in combination with the BLOSUM62 substitution matrix described in FIG. 2 of Henikoff and Henikoff (1992), “Amino acid substitution matrices from protein blocks”, PNAS (1992 Nov), 89:10915-10919. For the present invention, an appropriate value for the gap insertion penalty is −12, and an appropriate value for the gap extension penalty is −4. Computer programs performing alignments using the algorithm of Smith-Waterman and the BLOSUM62 matrix, such as the GCG program suite (Oxford Molecular Group, Oxford, England), are commercially available and widely used by those skilled in the art.
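The alignment step can be illustrated with a score-only sketch of Smith-Waterman local alignment using affine gap penalties. Several choices here are assumptions: the gap insertion penalty of 12 is charged for the first gapped position and the extension penalty of 4 for each further position, a toy match/mismatch function stands in for the BLOSUM62 matrix, and the function names are hypothetical. Commercial packages may define the penalties differently, and a full implementation would also trace back the alignment itself.

```python
def smith_waterman_score(a, b, score, gap_open=12, gap_extend=4):
    """Best local alignment score with affine gap penalties.

    Assumes a gap of length k costs gap_open + (k - 1) * gap_extend.
    `score(x, y)` returns the substitution score for residues x and y.
    """
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0] * cols        # best score ending at (i - 1, j)
    f_prev = [neg_inf] * cols  # ending at (i - 1, j) with a gap in b
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf            # ending at (i, j) with a gap in a
        for j in range(1, cols):
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            h_cur[j] = max(0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def toy_score(x, y):
    """Toy substitution score standing in for BLOSUM62 (assumption)."""
    return 5 if x == y else -4


# Two extra residues in the first sequence force a gap of length 2.
print(smith_waterman_score("ACDEFGWWHIKLMN", "ACDEFGHIKLMN", toy_score))
```

With the toy scores, the gapped alignment earns 12 matches at 5 each minus a gap cost of 12 + 4, which beats either ungapped six-residue block on its own.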

Once the alignment between the candidate and reference sequence is made, a percent similarity score may be calculated. The individual amino acids of each sequence are compared sequentially according to their similarity to each other. If the value in the BLOSUM62 matrix corresponding to the two aligned amino acids is zero or a negative number, the pairwise similarity score is zero; otherwise the pairwise similarity score is 1.0. The raw similarity score is the sum of the pairwise similarity scores of the aligned amino acids. The raw score is then normalized by dividing it by the number of amino acids in the smaller of the candidate or reference sequences. The normalized raw score is the percent similarity. Alternatively, to calculate a percent identity, the aligned amino acids of each sequence are again compared sequentially. If the amino acids are non-identical, the pairwise identity score is zero; otherwise the pairwise identity score is 1.0. The raw identity score is the sum of the identical aligned amino acids. The raw score is then normalized by dividing it by the number of amino acids in the smaller of the candidate or reference sequences. The normalized raw score is the percent identity. Insertions and deletions are ignored for the purposes of calculating percent similarity and identity. Accordingly, gap penalties are not used in this calculation, although they are used in the initial alignment.

Administration

The targeted therapeutics produced according to the present invention can be administered to a mammalian host by any route. Thus, as appropriate, administration can be oral or parenteral, including intravenous and intraperitoneal routes of administration. In addition, administration can be by periodic injections of a bolus of the therapeutic or can be made more continuous by intravenous or intraperitoneal administration from a reservoir which is external (e.g., an i.v. bag). In certain embodiments, the therapeutics of the instant invention can be pharmaceutical-grade. That is, certain embodiments comply with standards of purity and quality control required for administration to humans. Veterinary applications are also within the intended meaning as used herein.

The formulations, both for veterinary and for human medical use, of the therapeutics according to the present invention typically include such therapeutics in association with a pharmaceutically acceptable carrier therefor and optionally other ingredient(s). The carrier(s) can be “acceptable” in the sense of being compatible with the other ingredients of the formulations and not deleterious to the recipient thereof. Pharmaceutically acceptable carriers, in this regard, are intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, use thereof in the compositions is contemplated. Supplementary active compounds (identified according to the invention and/or known in the art) also can be incorporated into the compositions. The formulations can conveniently be presented in dosage unit form and can be prepared by any of the methods well known in the art of pharmacy/microbiology. In general, some formulations are prepared by bringing the therapeutic into association with a liquid carrier or a finely divided solid carrier or both, and then, if necessary, shaping the product into the desired formulation.

A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include oral or parenteral, e.g., intravenous, intradermal, inhalation, transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide.

Useful solutions for oral or parenteral administration can be prepared by any of the methods well known in the pharmaceutical art, described, for example, in Remington's Pharmaceutical Sciences, (Gennaro, A., ed.), Mack Pub., 1990. Formulations for parenteral administration also can include glycocholate for buccal administration, methoxysalicylate for rectal administration, or citric acid for vaginal administration. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic. Suppositories for rectal administration also can be prepared by mixing the drug with a non-irritating excipient such as cocoa butter, other glycerides, or other compositions that are solid at room temperature and liquid at body temperatures. Formulations also can include, for example, polyalkylene glycols such as polyethylene glycol, oils of vegetable origin, hydrogenated naphthalenes, and the like. Formulations for direct administration can include glycerol and other compositions of high viscosity. Other potentially useful parenteral carriers for these therapeutics include ethylene-vinyl acetate copolymer particles, osmotic pumps, implantable infusion systems, and liposomes. Formulations for inhalation administration can contain as excipients, for example, lactose, or can be aqueous solutions containing, for example, polyoxyethylene-9-lauryl ether, glycocholate and deoxycholate, or oily solutions for administration in the form of nasal drops, or as a gel to be applied intranasally. Retention enemas also can be used for rectal delivery.

Formulations of the present invention suitable for oral administration can be in the form of discrete units such as capsules, gelatin capsules, sachets, tablets, troches, or lozenges, each containing a predetermined amount of the drug; in the form of a powder or granules; in the form of a solution or a suspension in an aqueous liquid or non-aqueous liquid; or in the form of an oil-in-water emulsion or a water-in-oil emulsion. The therapeutic can also be administered in the form of a bolus, electuary or paste. A tablet can be made by compressing or molding the drug optionally with one or more accessory ingredients. Compressed tablets can be prepared by compressing, in a suitable machine, the drug in a free-flowing form such as a powder or granules, optionally mixed with a binder, lubricant, inert diluent, surface active or dispersing agent. Molded tablets can be made by molding, in a suitable machine, a mixture of the powdered drug and suitable carrier moistened with an inert liquid diluent.

Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients. Oral compositions prepared using a fluid carrier for use as a mouthwash include the compound in the fluid carrier and are applied orally and swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition can be sterile and can be fluid to the extent that easy syringability exists. It can be stable under the conditions of manufacture and storage and can be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, and sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, methods of preparation include vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

Formulations suitable for intra-articular administration can be in the form of a sterile aqueous preparation of the therapeutic which can be in microcrystalline form, for example, in the form of an aqueous microcrystalline suspension. Liposomal formulations or biodegradable polymer systems can also be used to present the therapeutic for both intra-articular and ophthalmic administration.

Formulations suitable for topical administration, including eye treatment, include liquid or semi-liquid preparations such as liniments, lotions, gels, applications, oil-in-water or water-in-oil emulsions such as creams, ointments or pastes; or solutions or suspensions such as drops. Formulations for topical administration to the skin surface can be prepared by dispersing the therapeutic with a dermatologically acceptable carrier such as a lotion, cream, ointment or soap. In some embodiments, useful carriers are those capable of forming a film or layer over the skin to localize application and inhibit removal. Where adhesion to a tissue surface is desired, the composition can include the therapeutic dispersed in a fibrinogen-thrombin composition or other bioadhesive. The therapeutic then can be painted, sprayed or otherwise applied to the desired tissue surface. For topical administration to internal tissue surfaces, the agent can be dispersed in a liquid tissue adhesive or other substance known to enhance adsorption to a tissue surface. For example, hydroxypropylcellulose or fibrinogen/thrombin solutions can be used to advantage. Alternatively, tissue-coating solutions, such as pectin-containing formulations, can be used.

For inhalation treatments, such as for asthma, inhalation of powder (self-propelling or spray formulations) dispensed with a spray can, a nebulizer, or an atomizer can be used. Such formulations can be in the form of a finely comminuted powder for pulmonary administration from a powder inhalation device or self-propelling powder-dispensing formulations. In the case of self-propelling solution and spray formulations, the effect can be achieved either by choice of a valve having the desired spray characteristics (i.e., being capable of producing a spray having the desired particle size) or by incorporating the active ingredient as a suspended powder in controlled particle size. For administration by inhalation, the therapeutics also can be delivered in the form of an aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer. Nasal drops also can be used.

Systemic administration also can be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants generally are known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the therapeutics typically are formulated into ointments, salves, gels, or creams as generally known in the art.

In one embodiment, the therapeutics are prepared with carriers that will protect against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials also can be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811. Microsomes and microparticles also can be used.

Oral or parenteral compositions can be formulated in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form refers to physically discrete units suited as unitary dosages for the subject to be treated, each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specifications for the dosage unit forms of the invention are dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.

Generally, the therapeutics identified according to the invention can be formulated for parenteral or oral administration to humans or other mammals, for example, in therapeutically effective amounts, e.g., amounts which provide appropriate concentrations of the drug to target tissue for a time sufficient to induce the desired effect. Additionally, the therapeutics of the present invention can be administered alone or in combination with other molecules known to have a beneficial effect on the particular disease or indication of interest. By way of example only, useful cofactors include symptom-alleviating cofactors, including antiseptics, antibiotics, antiviral and antifungal agents and analgesics and anesthetics.

The effective concentration of the therapeutics identified according to the invention that is to be delivered in a therapeutic composition will vary depending upon a number of factors, including the final desired dosage of the drug to be administered and the route of administration. The preferred dosage to be administered also is likely to depend on such variables as the type and extent of disease or indication to be treated, the overall health status of the particular patient, the relative biological efficacy of the therapeutic delivered, the formulation of the therapeutic, the presence and types of excipients in the formulation, and the route of administration. In some embodiments, the therapeutics of this invention can be provided to an individual using typical dose units deduced from the earlier-described mammalian studies using non-human primates and rodents. As described above, a dosage unit refers to a unitary, i.e. a single dose which is capable of being administered to a patient, and which can be readily handled and packed, remaining as a physically and biologically stable unit dose comprising either the therapeutic as such or a mixture of it with solid or liquid pharmaceutical diluents or carriers.

In certain embodiments, organisms are engineered to produce the therapeutics identified according to the invention. These organisms can release the therapeutic for harvesting or can be introduced directly to a patient. In another series of embodiments, cells can be utilized to serve as a carrier of the therapeutics identified according to the invention.

Therapeutics of the invention also include “prodrug” derivatives. The term prodrug refers to a pharmacologically inactive (or partially inactive) derivative of a parent molecule that requires biotransformation, either spontaneous or enzymatic, within the organism to release or activate the active component. Prodrugs are variations or derivatives of the therapeutics of the invention which have groups cleavable under metabolic conditions. Prodrugs become the therapeutics of the invention, which are pharmaceutically active in vivo, when they undergo solvolysis under physiological conditions or undergo enzymatic degradation. Prodrugs of this invention can be called single, double, triple, and so on, depending on the number of biotransformation steps required to release or activate the active drug component within the organism, and indicating the number of functionalities present in a precursor-type form. Prodrug forms often offer advantages of solubility, tissue compatibility, or delayed release in the mammalian organism (see, Bundgaard, Design of Prodrugs, pp. 7-9, 21-24, Elsevier, Amsterdam 1985 and Silverman, The Organic Chemistry of Drug Design and Drug Action, pp. 352-401, Academic Press, San Diego, Calif., 1992). Moreover, the prodrug derivatives according to this invention can be combined with other features to enhance bioavailability.

EXAMPLES

Example 1 Trans Expression of GAA

The following primers were used to generate a gene cassette containing the human IGF-II signal sequence fused to human GAA residues 791-952 (the C-terminal domain).

GAA41: GGAATTCAGGCGCGCCGGCAGCTCCCCGTGAGCCAGCC (SEQ ID NO:_)
GAA27: GCTCTAGACTAACACCAGCTGACGAGAAACTGC (SEQ ID NO:_)

GAA41 and GAA27 were used to amplify the C-terminal domain of GAA by PCR. The amplified fragment contains an AscI site at the 5′ terminus. The SS N-tag encoding the IGF-II signal sequence (residues 1-25) with an AscI site at the 3′ end was then fused at the AscI site to the GAA C-terminal domain and the cassette was cloned in pCEP4 to generate plasmid pCEP-SS-GAA-791-952. The SS N-tag nucleic acid sequence is shown below.

DNA sequence of the SS N-tag: gaattcACACCAATGGGAATCCCAATGGGGAAGTCGATGCTGGTGCTTCT CACCTTCTTGGCCTTCGCCTCGTGCTGCATTGCTGCTggcgcgccg
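As a quick consistency check on the SS N-tag sequence above, the following sketch translates the cassette from its first ATG using the standard genetic code. The function name `translate` and the choice to start reading at the first ATG are illustrative assumptions rather than part of the described methods; the sequence is copied from the listing above with the line break removed.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first ATG until a stop codon or the end of the sequence."""
    seq = dna.upper()
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# SS N-tag as listed above: EcoRI site, IGF-II signal sequence, AscI site.
SS_N_TAG = (
    "gaattcACACCAATGGGAATCCCAATGGGGAAGTCGATGCTGGTGCTTCT"
    "CACCTTCTTGGCCTTCGCCTCGTGCTGCATTGCTGCTggcgcgccg"
)

ss_protein = translate(SS_N_TAG)
print(ss_protein)  # MGIPMGKSMLVLLTFLAFASCCIAAGAP
```

Under these assumptions the first 25 residues correspond to the IGF-II signal sequence named in the text, and the final three residues (Gly-Ala-Pro) are read through the AscI site that joins the tag to GAA.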

The following additional plasmids were generated similarly: pCEP-GAA Δ 817-952 that lacks C-terminal GAA residues 817-952 and pCEP-GAA Δ 817-952-GILT Δ 1-7 that is similar to pCEP-GAA Δ 817-952 except for the addition of a C-terminal GILTΔ1-7 tag. GAA Δ 817-952 was generated by introducing a stop codon after amino acid residue 816. To facilitate the cloning process, the stop codon is followed by an XbaI restriction site at the 3′ end, and the 5′ end contains an EcoRI restriction site. DNA and amino acid sequences of GAA Δ 817-952 are shown below.

DNA sequence of GAA Δ 817-952. gaattcCAAACCATGGGAGTGAGGCACCCGCCCTGCTCCCACCGGCTCCT GGCCGTCTGCGCCCTCGTGTCCTTGGCAACCGCTGCACTCCTGGGGCACA TCCTACTCCATGATTTCCTGCTGGTTCCCCGAGAGCTGAGTGGCTCCTCC CCAGTCCTGGAGGAGACTCACCCAGCTCACCAGCAGGGAGCCAGCAGACC AGGGCCCCGGGATGCCCAGGCACACCCCGGCCGTCCCAGAGCAGTGCCCA CACAGTGCGACGTCCCCCCCAACAGCCGCTTCGATTGCGCCCCTGACAAG GCCATCACCCAGGAACAGTGCGAGGCCCGCGGCTGCTGCTACATCCCTGC AAAGCAGGGGCTGCAGGGAGCCCAGATGGGGCAGCCCTGGTGCTTCTTCC CACCCAGCTACCCCAGCTACAAGCTGGAGAACCTGAGCTCCTCTGAAATG GGCTACACGGCCACCCTGACCCGTACCACCCCCACCTTCTTCCCCAAGGA CATCCTGACCCTGCGGCTGGACGTGATGATGGAGACTGAGAACCGCCTCC ACTTCACGATCAAAGATCCAGCTAACAGGCGCTACGAGGTGCCCTTGGAG ACCCCGCGTGTCCACAGCCGGGCACCGTCCCCACTCTACAGCGTGGAGTT CTCtGAGGAGCCCTTCGGGGTGATCGTGCACCGGCAGCTGGACGGCCGCG TGCTGCTGAACACGACGGTGGCGCCCCTGTTCTTTGCGGACCAGTTCCTT CAGCTGTCCACCTCGCTGCCCTCGCAGTATATCACAGGCCTCGCCGAGCA CCTCAGTCCCCTGATGCTCAGCACCAGCTGGACCAGGATCACCCTGTGGA ACCGGGACCTTGCGCCCACGCCCGGTGCGAACCTCTACGGGTCTCACCCT TTCTACCTGGCGCTGGAGGACGGCGGGTCGGCACACGGGGTGTTCCTGCT AAACAGCAATGCCATGGATGTGGTCCTGCAGCCGAGCCCTGCCCTTAGCT GGAGGTCGACAGGTGGGATCCTGGATGTCTACATCTTCCTGGGCCCAGAG CCCAAGAGCGTGGTGCAGCAGTACCTGGACGTTGTGGGATACCCGTTCAT GCCGCCATACTGGGGCCTGGGCTTCCACCTGTGCCGCTGGGGCTACTCCT CCACCGCTATCACCCGCCAGGTGGTGGAGAACATGACCAGGGCCCACTTC CCCCTGGACGTCCAATGGAACGACCTGGACTACATGGACTCCCGGAGGGA CTTCACGTTCAACAAGGATGGCTTCCGGGACTTCCCGGCCATGGTGCAGG AGCTGCACCAGGGCGGCCGGCGCTACATGATGATCGTGGATCCTGCCATC AGCAGCTCGGGCCCTGCCGGGAGCTACAGGCCCTACGACGAGGGTCTGCG GAGGGGGGTTTTCATCACCAACGAGACCGGCCAGCCGCTGATTGGGAAGG TATGGCCCGGGTCCACTGCCTTCCCCGACTTCACCAACCCCACAGCCCTG GCCTGGTGGGAGGACATGGTGGCTGAGTTCCATGACCAGGTGCCCTTCGA CGGCATGTGGATTGACATGAACGAGCCTTCCAACTTCATCAGGGGCTCTG AGGACGGCTGCCCCAACAATGAGCTGGAGAACCCACCCTACGTGCCTGGG GTGGTTGGGGGGACCCTCCAGGCGGCAACCATCTGTGCCTCCAGCCACCA GTTTCTCTCCACACACTACAACCTGCACAACCTCTACGGCCTGACCGAAG CCATCGCCTCCCACAGGGCGCTGGTGAAGGCTCGGGGGACACGCCCATTT GTGATCTCCCGCTCGACCTTTGCTGGCCACGGCCGATACGCCGGCCACTG GACGGGGGACGTGTGGAGCTCCTGGGAGCAGCTCGCCTCCTCCGTGCCAG 
AAATCCTGCAGTTTAACCTGCTGGGGGTGCCTCTGGTCGGGGCCGACGTC TGCGGCTTCCTGGGCAACACCTCAGAGGAGCTGTGTGTGCGCTGGACCCA GCTGGGGGCCTTCTACCCCTTCATGCGGAACCACAACAGCCTGCTCAGTC TGCCCCAGGAGCCGTACAGCTTCAGCGAGCCGGCCCAGCAGGCCATGAGG AAGGCCCTCACCCTGCGCTACGCACTCCTCCCCCACCTCTACACGCTGTT CCACCAGGCCCACGTCGCGGGGGAGACCGTGGCCCGGCCCCTCTTCCTGG AGTTCCCCAAGGACTCTAGCACCTGGACTGTGGACCACCAGCTCCTGTGG GGGGAGGCCCTGCTCATCACCCCAGTGCTCCAGGCCGGGAAGGCCGAAGT GACTGGCTACTTCCCCTTGGGCACATGGTACGACCTGCAGACGGTGCCAA TAGAGGCCCTTGGCAGCCTCCCACCCCCACCTGCAGCTCCCCGTGAGCCA GCCATCCACAGCGAGGGGCAGTGGGTGACGCTGCCGGCCCCCCTGGACAC CATCAACGTCTAGtctaga

Amino acid sequence of GAA Δ 817-952. MGVRHPPCSHRLLAVCALVSLATAALLGHILLHDFLLVPRELSGSSPVLE ETHPAHQQGASRPGPRDAQAHPGRPRAVPTQCDVPPNSRFDCAPDKAITQ EQCEARGCCYIPAKQGLQGAQMGQPWCFFPPSYPSYKLENLSSSEMGYTA TLTRTTPTFFPKDILTLRLDVMMETENRLHFTIKDPANRRYEVPLETPRV HSRAPSPLYSVEFSEEPFGVIVHRQLDGRVLLNTTVAPLFFADQFLQLST SLPSQYITGLAEHLSPLMLSTSWTRITLWNRDLAPTPGANLYGSHPFYLA LEDGGSAHGVFLLNSNAMDVVLQPSPALSWRSTGGILDVYIFLGPEPKSV VQQGYPFMPPYWGLGFHLCRWGYSSTAITRQVVENMTRAHFPLDVQWNDL DYMDSRRDFTFNKDGFRDFPAMVQELHQGGRRYMMIVDPAISSSGPAGSY RPRDEGLRRGVFITNETGQPLIGKVWPGSTAFPDFTNPTALAWWEDMVAE FHDQVPFDGMWIDMNEPSNFIRGSEDGCPNNELENPPYVPGVVGGTLQAA TICASSHQFLSTHYNLHNLYGLTEAIASHRALVKARGTRPFVISRSTFAG HGRYAGHWTGDVWSSWEQLASSVPEILQFNLLGVPLVGADVCGFLGNTSE ELCVRWTQLGAFYPFMRNHNSLLSLPQEPYSFSEPAQQAMRKALTLRYAL LPHLYTLFHQAHVAGETVARPLFLEFPKDSSTWTVDHQLLWGEALLITPV LQAGKAEVTGYFPLGTWYDLQTVPIEALGSLPPPPAAPREPAIHSEGQWV TLPAPLDTINV.

To determine if the GAA C-terminal region functions when expressed in trans, pCEP-SS-GAA-791-952 was transfected into HEK293 cells alone as well as in combination with either plasmid pCEP-GAA Δ 817-952 or with pCEP-GAA Δ 817-952-GILT Δ 1-7. As controls, pCEP-GAA Δ 817-952 and pCEP-GAA Δ 817-952-GILT Δ 1-7 were also transfected into HEK293 cells alone. Standard transfection methods were used for the experiments. For single plasmid transfections, 1 μg of plasmid DNA was used. For co-transfections, 0.5 μg of each plasmid was used. 1 μg of total DNA was mixed with 96 μL of HEK293 growth medium lacking serum and 4 μL FuGene6 (Roche) as directed by the manufacturer. 50 μL of the mixture was added to each duplicate well of HEK293 cells growing in 12-well plates in 1 mL Dulbecco's Modified Eagle's Medium supplemented with 1.5 g/L sodium bicarbonate, 10% heat-inactivated FBS, and 4 mM L-glutamine. Cells were incubated 2-3 days at 37° C. in 5% CO₂.
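The per-well quantities implied by this protocol can be worked out with simple arithmetic. The sketch below assumes that the volume of the DNA itself is negligible, so that the transfection mixture is 100 μL; the function name and dictionary keys are hypothetical and chosen only for illustration.

```python
def per_well_amounts(total_dna_ug, medium_ul, reagent_ul, aliquot_ul, n_plasmids=1):
    """Return the DNA and reagent delivered to one well.

    Assumes the DNA contributes no volume to the mixture (an approximation).
    """
    mix_volume_ul = medium_ul + reagent_ul
    fraction = aliquot_ul / mix_volume_ul
    dna_per_well_ug = total_dna_ug * fraction
    return {
        "mix_volume_ul": mix_volume_ul,
        "dna_per_well_ug": dna_per_well_ug,
        "dna_per_plasmid_ug": dna_per_well_ug / n_plasmids,
        "reagent_per_well_ul": reagent_ul * fraction,
    }


# Values taken from the protocol: 1 ug DNA, 96 uL medium, 4 uL FuGene6, 50 uL per well.
single = per_well_amounts(1.0, 96, 4, 50)
co_transfection = per_well_amounts(1.0, 96, 4, 50, n_plasmids=2)

print(single)
print(co_transfection)
```

With these inputs each duplicate well receives half of the mixture, that is 0.5 μg of total DNA and 2 μL of transfection reagent, and in a co-transfection each of the two plasmids contributes 0.25 μg per well.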

Growth media were collected and assayed to determine GAA activities as described (Reuser, A. J., et al. (1978) Am. J. Hum. Genet. 30:132-143). No GAA activity was detected in the media collected from HEK293 cells transfected with single plasmids. By contrast, GAA activities were present in the growth media collected from HEK293 cells co-transfected with pCEP-SS-GAA-791-952 and either pCEP-GAAΔ817-952 or pCEP-GAAΔ817-952-GILTΔ1-7 (Table 1).

Therefore, the two C-terminal deletion constructs pCEP-GAAΔ817-952 and pCEP-GAAΔ817-952-GILTΔ1-7 only express functional proteins when co-expressed with the C-terminal domain plasmid, pCEP-SS-GAA-791-952. This experiment demonstrated that the C-terminal GAA region cooperates with the mature, N-terminal region when coexpressed in trans.

TABLE 1 Transient co-transfection of GAA C-terminal and N-terminal domains.

Plasmid 1                   Plasmid 2                   GAA activity (nmol/hr-ml)
pCEP-SS-GAA-791-952         None                        0
pCEP-GAAΔ817-952            None                        0
pCEP-GAAΔ817-952-GILTΔ1-7   None                        0
pCEP-SS-GAA-791-952         pCEP-GAAΔ817-952            14
pCEP-SS-GAA-791-952         pCEP-GAAΔ817-952-GILTΔ1-7   3

Example 2 Region Required for Efficient GAA Trans-expression

In the transient co-transfection experiment described in Example 1, the GAA region including amino acid residues 792-817 is present in both halves of the trans-expression constructs. In order to determine if the overlap of this region is necessary for efficient trans-expression, a pair of constructs, pCEP-GAAΔ791-952-GILTΔ1-7 and pCEP-SS-GAA-791-952, were designed with no overlap and the GILT tag was fused at position 791. As indicated in Table 2, transient co-transfection experiments demonstrated that the presence of amino acid residues 792-817 within the C-terminal domain is required for efficient GAA trans-expression.

TABLE 2 Amino acid residues 792-817 are required for efficient GAA trans-expression.

Plasmid 1                   Plasmid 2                   GAA activity (nmol/hr-ml)
pCEP-GAA                    None                        58
pCEP-SS-GAA-791-952         None                        1
pCEP-GAAΔ817-952-GILTΔ1-7   None                        1
pCEP-GAAΔ791-952-GILTΔ1-7   None                        1
pCEP-SS-GAA-817-952         None                        1
pCEP-SS-GAA-791-952         pCEP-GAAΔ791-952-GILTΔ1-7   2
pCEP-SS-GAA-791-952         pCEP-GAAΔ817-952-GILTΔ1-7   16
pCEP-SS-GAA-817-952         pCEP-GAAΔ791-952-GILTΔ1-7   1
pCEP-SS-GAA-817-952         pCEP-GAAΔ817-952-GILTΔ1-7   1

Example 3 Construction of a GAA Protein With an Internal GILT Tag

PCR was used to first generate an insertion of the nucleotide sequence GGCGCGCCG (SEQ ID NO:_) after nucleotide 2370 of the complete human GAA sequence (SEQ ID NO:_). This insertion forms an AscI restriction site preceding Ala791. The GILT tag was PCR-amplified with the following DNA oligos:

IGF7: gctctagaggcgcgccCTCGGACTTGGCGGGGGTAGC (SEQ ID NO:_)
IGF8: ggaattcaggcgcgccgGCTTACCGCCCCAGTGAGAC (SEQ ID NO:_)

The amplified GILT tag contains an AscI restriction site at each terminus. This GILT tag was digested with AscI and inserted into the AscI site preceding GAA Ala791 as described above. DNA sequencing confirmed the in-frame orientation of the GILT insertion. This GAA cassette containing an internal GILT tag preceding Ala791 was expressed in vector pCEP4 in a plasmid named pCEP-GAA-IRGILT-4. pCEP-GAA-IRGILT-4 was found to contain a PCR-generated mutation T1712C within the GAA coding sequence. This construct produced functional GAA protein.
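The reading-frame logic of this insertion can be sketched as follows. The example assumes that nucleotide numbering starts at the first base of the GAA start codon, which the text does not state explicitly; the helper name `asci_positions` is hypothetical. It checks that an insertion after nucleotide 2370 falls on a codon boundary immediately before codon 791, that the 9-nucleotide insert preserves the frame, and that each GILT primer carries an AscI recognition site (GGCGCGCC).

```python
ASCI_SITE = "GGCGCGCC"          # AscI recognition sequence
INSERT = "GGCGCGCCG"            # sequence inserted into GAA, from the text
INSERT_AFTER_NT = 2370          # position given in the text

# GILT tag primers as listed above.
IGF7 = "gctctagaggcgcgccCTCGGACTTGGCGGGGGTAGC"
IGF8 = "ggaattcaggcgcgccgGCTTACCGCCCCAGTGAGAC"


def asci_positions(seq: str) -> list:
    """Return 0-based start positions of every AscI site in seq."""
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - len(ASCI_SITE) + 1)
        if seq.startswith(ASCI_SITE, i)
    ]


# Assumption: nucleotide 1 is the A of the ATG start codon.
codons_before_insert, remainder = divmod(INSERT_AFTER_NT, 3)
next_codon = codons_before_insert + 1

print(codons_before_insert, remainder, next_codon)
print(len(INSERT) % 3 == 0)
print(asci_positions(INSERT), asci_positions(IGF7), asci_positions(IGF8))
```

Under the stated numbering assumption, 790 complete codons precede the insertion point, so the inserted AscI site sits directly in front of codon 791, consistent with the description of a site preceding Ala791.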

Example 4 GAA Deletion Constructs With N-terminal GILT Tag

A set of five tags suitable for N-terminal GAA expression (N-tags) were generated by PCR amplification using primers indicated in Table 3. The GILT N-tag contains the native IGF-II signal sequence and complete GILT epitope. The SS N-tag contains only the IGF-II signal sequence.

For example, the GILTΔ1-7 N-tag contains the IGF-II signal sequence and GILT epitope residues 8-67. It was generated with three PCR reactions: (1) PCR amplification from human IGF-II DNA template using primers IGF1 and IGF4; (2) PCR amplification from human IGF-II DNA template using primers IGF2 and IGF7; and (3) PCR amplification from the products of the first two PCR reactions using primers IGF1 and IGF7.

The GILTΔ2-7 N-tag contains the IGF-II signal sequence, and GILT epitope residue 1 followed by residues 8-67. It was generated with three PCR reactions: (1) PCR amplification from human IGF-II DNA template using primers IGF1 and IGF5; (2) PCR amplification from human IGF-II DNA template using primers IGF3 and IGF7; and (3) PCR amplification from the products of the first two PCR reactions using primers IGF1 and IGF7.

The SSGAA-GILT N-tag contains the GAA signal sequence within residues 1-69, followed by the complete GILT epitope. It was generated with three PCR reactions: (1) PCR amplification from human GAA DNA template using primers GAA13 and GI1; (2) PCR amplification from human IGF-II template using primers GI2 and IGF7; and (3) PCR amplification from the products of the first two PCR reactions using primers GAA13 and IGF7.

Each N-tag contains a 5′ EcoRI restriction site and 3′ AscI and XbaI sites. The AscI site was used to fuse each tag to the GAA N-terminal deletion constructs described below.

TABLE 3 N-terminal tag constructs.

N-Tag Name    DNA Primers                                                      PCR Template
GILT          IGF1: GGAATTCACACCAATGGGAATCCCAATGG (SEQ ID NO:_)                Human IGF-II
              IGF7: GCTCTAGAGGCGCGCCCTCGGACTTGGCGGGGGTAGC (SEQ ID NO:_)
SS            IGF1: GGAATTCACACCAATGGGAATCCCAATGG (SEQ ID NO:_)                Human IGF-II
              IGF6: GCTCTAGAGGCGCGCCAGCAGCAATGCAGCACGAGG (SEQ ID NO:_)
GILTΔ1-7      IGF1: GGAATTCACACCAATGGGAATCCCAATGG (SEQ ID NO:_)                Human IGF-II
              IGF4: ACCAGCTCCCCGCCGCACAGAGCAATGCAGCACGAGGCG (SEQ ID NO:_)
              IGF2: TCGCCTCGTGCTGCATTGCTCTGTGCGGCGGGGAGCTGG (SEQ ID NO:_)      Human IGF-II
              IGF7: GCTCTAGAGGCGCGCCCTCGGACTTGGCGGGGGTAGC (SEQ ID NO:_)
GILTΔ2-7      IGF1: GGAATTCACACCAATGGGAATCCCAATGG (SEQ ID NO:_)                Human IGF-II
              IGF5: ACCAGCTCCCCGCCGCACAGAGCAGCAATGCAGCACGAGG (SEQ ID NO:_)
              IGF3: CCTCGTGCTGCATTGCTGCTCTGTGCGGCGGGGAGCTGG (SEQ ID NO:_)      Human IGF-II
              IGF7: GCTCTAGAGGCGCGCCCTCGGACTTGGCGGGGGTAGC (SEQ ID NO:_)
SSGAA-GILT    GAA13: GGAATTCCAACCATGGGAGTGAGGCACCCGCCC                         Human GAA
              GI1: GGGTCTCACTGGGGCGGTATGCCTGGGCATCCCGGGGCC (SEQ ID NO:_)
              GI2: GGCCCCGGGATGCCCAGGCATACCGCCCCAGTGAGACCC (SEQ ID NO:_)       Human IGF-II
              IGF7: GCTCTAGAGGCGCGCCCTCGGACTTGGCGGGGGTAGC (SEQ ID NO:_)
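The three-reaction schemes described for the GILTΔ1-7 and GILTΔ2-7 N-tags have the form of an overlap-extension PCR, in which the internal primers of the first two reactions must be complementary so that the two products can anneal in the third reaction. The sketch below checks this for the primer pairs IGF4/IGF2 and IGF5/IGF3 listed in Table 3. The function names are hypothetical, and reading the scheme as overlap extension is an interpretation of the text rather than a statement taken from it.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_shared_stretch(a: str, b: str) -> str:
    """Longest common substring of a and b, by simple dynamic programming."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def primer_overlap(reverse_primer: str, forward_primer: str) -> str:
    """Stretch of reverse_primer that base-pairs with forward_primer."""
    return longest_shared_stretch(
        reverse_primer.upper(), reverse_complement(forward_primer)
    )


# Internal primers from Table 3.
IGF4 = "ACCAGCTCCCCGCCGCACAGAGCAATGCAGCACGAGGCG"
IGF2 = "TCGCCTCGTGCTGCATTGCTCTGTGCGGCGGGGAGCTGG"
IGF5 = "ACCAGCTCCCCGCCGCACAGAGCAGCAATGCAGCACGAGG"
IGF3 = "CCTCGTGCTGCATTGCTGCTCTGTGCGGCGGGGAGCTGG"

overlap_d1_7 = primer_overlap(IGF4, IGF2)
overlap_d2_7 = primer_overlap(IGF5, IGF3)
print(len(overlap_d1_7), len(overlap_d2_7))  # 38 39
```

Both pairs are complementary over nearly their full length, which is what allows the products of reactions (1) and (2) to prime on each other and be amplified with the outer primers IGF1 and IGF7 in reaction (3).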

Portions of the N-terminal human GAA DNA sequence were deleted and replaced with an AscI restriction site using PCR techniques. 5′ DNA oligos used to define the site of deletion are listed below (Table 4). 5′ oligos were paired with various 3′ oligos within the GAA coding sequence, and the resulting DNA fragments were subsequently fused to the complete C-terminal GAA coding sequence. The N-terminal AscI sites were fused to one of the five N-terminal tags (N-tags) listed above to complete the expression cassettes (Table 4).

TABLE 4 GAA N-terminal deletion constructs. Sequences complementary to GAA coding sequence are in upper case. EcoRI and AscI restriction sites are in lower case.

Deletion Name   5′ DNA Oligos
GAAΔ1-24        GAA32: ggaattcaggcgcgccgGCACTCCTGGGGCACATCC (SEQ ID NO:_)
GAAΔ1-28        GAA28: ggaattcaggcgcgccgCACATCCTACTCCATGATTTC (SEQ ID NO:_)
GAAΔ1-55        GAA29: ggaattcaggcgcgccgCACCAGCAGGGAGCCAGCAG (SEQ ID NO:_)
GAAΔ1-69        GAA30: ggaattcaggcgcgccgGCACACCCCGGCCGTCCCAG (SEQ ID NO:_)
GAAΔ1-80        GAA39: ggaattcaggcgcgccgCAGTGCGACGTCCCaCCCAAC (SEQ ID NO:_)
GAAΔ1-122       GAA33: ggaattcaggcgcgccgGGGCAGCCCTGGTGCTTCTTC (SEQ ID NO:_)
GAAΔ1-203       GAA34: ggaattcaggcgcgccgGCACCGTCCCCACTCTACAG (SEQ ID NO:_)

The expression cassettes listed in Table 4 contain an N-terminal tag (N-tag) fused to an N-terminal GAA deletion at a mutual AscI site. The cassettes were cloned into the multiple cloning site of expression vector pCEP4 and transfected into HEK293 cells using the FuGene6 transfection reagent (Roche). Media from transient expression were collected 2-3 days post transfection and assayed for secreted GAA activity using a standard enzymatic assay (Reuser, A. J., et al. (1978) Am. J. Hum. Genet. 30:132-143).

TABLE 5 Relative transient expression of N-tagged GAA constructs.

Plasmid Name                                Relative Transient
Vector   N-tag        GAA Deletion          Expression
pCEP     GILT         GAAΔ1-24              +
pCEP     SS           GAAΔ1-24              ++
pCEP     GILTΔ1-7     GAAΔ1-24              −
pCEP     GILTΔ2-7     GAAΔ1-24              −
pCEP     SSGAA-GILT   GAAΔ1-24              −
pCEP     GILT         GAAΔ1-28              ++
pCEP     SS           GAAΔ1-28              ++
pCEP     GILTΔ1-7     GAAΔ1-28              +
pCEP     GILTΔ2-7     GAAΔ1-28              ++
pCEP     SSGAA-GILT   GAAΔ1-28              ++
pCEP     GILT         GAAΔ1-55              ++
pCEP     SS           GAAΔ1-55              ++++
pCEP     GILTΔ1-7     GAAΔ1-55              ++
pCEP     GILTΔ2-7     GAAΔ1-55              ++
pCEP     SSGAA-GILT   GAAΔ1-55              ++
pCEP     GILT         GAAΔ1-69              ++
pCEP     SS           GAAΔ1-69              ++++
pCEP     GILTΔ2-7     GAAΔ1-69              ++
pCEP     SSGAA-GILT   GAAΔ1-69              +
pCEP     GILT         GAAΔ1-80              ++
pCEP     SS           GAAΔ1-80              ++++
pCEP     GILTΔ1-7     GAAΔ1-80              ++
pCEP     GILTΔ2-7     GAAΔ1-80              ++
pCEP     SSGAA-GILT   GAAΔ1-80              +
pCEP     GILT         GAAΔ1-122             −
pCEP     SS           GAAΔ1-122             −
pCEP     GILTΔ1-7     GAAΔ1-122             −
pCEP     GILTΔ2-7     GAAΔ1-122             −
pCEP     SSGAA-GILT   GAAΔ1-122             −
pCEP     GILT         GAAΔ1-203             −
pCEP     SS           GAAΔ1-203             −
pCEP     GILTΔ1-7     GAAΔ1-203             −
pCEP     GILTΔ2-7     GAAΔ1-203             −
pCEP     SSGAA-GILT   GAAΔ1-203             −
pCEP                  GAA                   ++

As these data indicated, the N-terminal portion of GAA including residues 1-80 is dispensable for transient expression, but deletions that disrupt or eliminate the trefoil domain do not produce functional protein.

Furthermore, as indicated in Table 6, the secretion of GAA can be improved by appropriately positioning a heterologous signal peptide, in this case the IGF-II signal peptide. Positioning the IGF-II signal peptide at either residue 56 or 70 of GAA gave an approximately three-fold increase in GAA secretion compared to native GAA, while positioning the IGF-II signal peptide at position 29 did not. This may be due to the retention of a putative trans-membrane domain adjacent to the GAA signal peptide.

TABLE 6 Changing GAA signal sequence positions affects GAA secretion.

                    Transient GAA Activity (nmol/hr-mL)
Plasmid             Experiment 1   Experiment 2
pCEP-GAA            121            111
pCEP-SS-GAAΔ1-28    NA             89
pCEP-SS-GAAΔ1-55    402            NA
pCEP-SS-GAAΔ1-69    325            NA
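The fold changes behind this comparison can be computed directly from the Table 6 values. The sketch below normalizes each construct to the pCEP-GAA control measured in the same experiment; the function name `fold_vs_control` and the experiment keys are hypothetical, and comparing only within an experiment is an assumption about how the values were meant to be read.

```python
# Transient GAA activity (nmol/hr-mL) from Table 6; None marks "NA".
ACTIVITY = {
    "pCEP-GAA": {"exp1": 121, "exp2": 111},
    "pCEP-SS-GAAΔ1-28": {"exp1": None, "exp2": 89},
    "pCEP-SS-GAAΔ1-55": {"exp1": 402, "exp2": None},
    "pCEP-SS-GAAΔ1-69": {"exp1": 325, "exp2": None},
}


def fold_vs_control(table, control="pCEP-GAA"):
    """Fold change of each construct relative to the control in the same experiment."""
    folds = {}
    for plasmid, runs in table.items():
        if plasmid == control:
            continue
        for experiment, value in runs.items():
            reference = table[control][experiment]
            if value is not None and reference is not None:
                folds[plasmid] = round(value / reference, 2)
    return folds


folds = fold_vs_control(ACTIVITY)
for plasmid, fold in folds.items():
    print(plasmid, fold)
```

On these numbers the constructs carrying the signal peptide at residues 56 and 70 give roughly 3.3-fold and 2.7-fold the activity of native GAA, while the construct with the signal peptide at position 29 gives about 0.8-fold.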

However, replacement of the native GAA signal peptide and trans-membrane domain with a heterologous signal peptide lowers the level of mannose-6-phosphate dependent cellular uptake associated with the protein (FIG. 4). The uptake experiments were performed as described in U.S. Patent Application Nos. 20040005309 and 20040006008, the contents of which are hereby incorporated by reference. As illustrated in FIG. 4, pCEP-SS-GAAΔ1-69 had one third the amount of uptake into Pompe fibroblasts as did wild-type pCEP-GAA.

In contrast, as illustrated in FIG. 5, by comparing uptake of pCEP-SS-GAAΔ1-69 with a construct with GILT tag, pCEP-SS-GILTΔ2-7-GAAΔ1-69, it was evident that the GILT tag promotes specific uptake that can be competed by IGF-II. Thus, placement of the peptide tag at position 70 not only permits efficient expression of the fusion protein and GAA activity, but also provides a peptide tag that is properly folded and accessible, permitting receptor-mediated uptake into target cells.

Example 5 Constructs With GILT1-87 Tag and Variants

In order to increase the likelihood of proper folding of an N-terminal GILT tag, a longer version of the GILT tag was generated that spans IGF-II residues 1-87. The additional IGF-II sequence should still allow receptor binding, and also provide a more native folding environment for the core of the tag. The GILT1-87 tag was fused to positions 56 and 70 of GAA, resulting in GILT1-87-GAA56-952 and GILT1-87-GAA70-952, respectively. The DNA and amino acid sequences of GILT1-87-GAA56-952 and GILT1-87-GAA70-952 are shown below.

DNA sequence of GILT1-87-GAA56-952: ggtaccACACCAATGGGAATCCCAATGGGGAAGTCGATGCTGGTGCTTCT CACCTTCTTGGCCTTCGCCTCGTGCTGCATTGCTGCTTACCGCCCCAGTG AGACCCTGTGCGGCGGGGAGCTGGTGGACACCCTCCAGTTCGTCTGTGGG GACCGCGGCTTCTACTTCAGCAGGCCCGCAAGCCGTGTGAGCCGTCGCAG CCGTGGCATCGTTGAGGAGTGCTGTTTCCGCAGCTGTGACCTGGCCCTCC TGGAGACGTACTGTGCTACCCCCGCCAAGTCCGAGAGGGACGTGTCGACC CCTCCGACCGTGCTTCCGGACAACTTCCCCAGATACCCCGTGGGCggcgc gccgCACCAGCAGGGAGCCAGCAGACCAGGGCCCCGGGATGCCCAGGCAC ACCCCGGCCGTCCCAGAGCAGTGCCCACACAGTGCGACGTCCCCCCCAAC AGCCGCTTCGATTGCGCCCCTGACAAGGCCATCACCCAGGAACAGTGCGA GGCCCGCGGCTGCTGCTACATCCCTGCAAAGCAGGGGCTGCAGGGAGCCC AGATGGGGCAGCCCTGGTGCTTCTTCCCACCCAGCTACCCCAGCTACAAG CTGGAGAACCTGAGCTCCTCTGAAATGGGCTACACGGCCACCCTGACCCG TACCACCCCCACCTTCTTCCCCAAGGACATCCTGACCCTGCGGCTGGACG TGATGATGGAGACTGAGAACCGCCTCCACTTCACGATCAAAGATCCAGCT AACAGGCGCTACGAGGTGCCCTTGGAGACCCCGCGTGTCCACAGCCGGGC ACCGTCCCCACTCTACAGCGTGGAGTTCTCtGAGGAGCCCTTCGGGGTGA TCGTGCACCGGCAGCTGGACGGCCGCGTGCTGCTGAACACGACGGTGGCG CCCCTGTTCTTTGCGGACCAGTTCCTTCAGCTGTCCACCTCGCTGCCCTC GCAGTATATCACAGGCCTCGCCGAGCACCTCAGTCCCCTGATGCTCAGCA CCAGCTGGACCAGGATCACCCTGTGGAACCGGGACCTTGCGCCCACGCCC GGTGCGAACCTCTACGGGTCTCACCCTTTCTACCTGGCGCTGGAGGACGG CGGGTCGGCACACGGGGTGTTCCTGCTAAACAGCAATGCCATGGATGTGG TCCTGCAGCCGAGCCCTGCCCTTAGCTGGAGGTCGACAGGTGGGATCCTG GATGTCTACATCTTCCTGGGCCCAGAGCCCAAGAGCGTGGTGCAGCAGTA CCTGGACGTTGTGGGATACCCGTTCATGCCGCCATACTGGGGCCTGGGCT TCCACCTGTGCCGCTGGGGCTACTCCTCCACCGCTATCACCCGCCAGGTG GTGGAGAACATGACCAGGGCCCACTTCCCCCTGGACGTCCAATGGAACGA CCTGGACTACATGGACTCCCGGAGGGACTTCACGTTCAACAAGGATGGCT TCCGGGACTTCCCGGCCATGGTGCAGGAGCTGCACCAGGGCGGCCGGCGC TACATGATGATCGTGGATCCTGCCATCAGCAGCTCGGGCCCTGCCGGGAG CTACAGGCCCTACGACGAGGGTCTGCGGAGGGGGGTTTTCATCACCAACG AGACCGGCCAGCCGCTGATTGGGAAGGTATGGCCCGGGTCCACTGCCTTC CCCGACTTCACCAACCCCACAGCCCTGGCCTGGTGGGAGGACATGGTGGC TGAGTTCCATGACCAGGTGCCCTTCGACGGCATGTGGATTGACATGAACG AGCCTTCCAACTTCATCAGGGGCTCTGAGGACGGCTGCCCCAACAATGAG CTGGAGAACCCACCCTACGTGCCTGGGGTGGTTGGGGGGACCCTCCAGGC GGCAACCATCTGTGCCTCCAGCCACCAGTTTCTCTCCACACACTACAACC 
TGCACAACCTCTACGGCCTGACCGAAGCCATCGCCTCCCACAGGGCGCTG GTGAAGGCTCGGGGGACACGCCCATTTGTGATCTCCCGCTCGACCTTTGC TGGCCACGGCCGATACGCCGGCCACTGGACGGGGGACGTGTGGAGCTCCT GGGAGCAGCTCGCCTCCTCCGTGCCAGAAATCCTGCAGTTTAACCTGCTG GGGGTGCCTCTGGTCGGGGCCGACGTCTGCGGCTTCCTGGGCAACACCTC AGAGGAGCTGTGTGTGCGCTGGACCCAGCTGGGGGCCTTCTACCCCTTCA TGCGGAACCACAACAGCCTGCTCAGTCTGCCCCAGGAGCCGTACAGCTTC AGCGAGCCGGCCCAGCAGGCCATGAGGAAGGCCCTCACCCTGCGCTACGC ACTCCTCCCCCACCTCTACACGCTGTTCCACCAGGCCCACGTCGCGGGGG AGACCGTGGCCCGGCCCCTCTTCCTGGAGTTCCCCAAGGACTCTAGCACC TGGACTGTGGACCACCAGCTCCTGTGGGGGGAGGCCCTGCTCATCACCCC AGTGCTCCAGGCCGGGAAGGCCGAAGTGACTGGCTACTTCCCCTTGGGCA CATGGTACGACCTGCAGACGGTGCCAATAGAGGCCCTTGGCAGCCTCCCA CCCCCACCTGCAGCTCCCCGTGAGCCAGCCATCCACAGCGAGGGGCAGTG GGTGACGCTGCCGGCCCCCCTGGACACCATCAACGTCCACCTCCGGGCTG GGTACATCATCCCCCTGCAGGGCCCTGGCCTCACAACCACAGAGTCCCGC CAGCAGCCCATGGCCCTGGCTGTGGCCCTGACCAAGGGTGGAGAGGCCCG AGGGGAGCTGTTCTGGGACGATGGAGAGAGCCTGGAAGTGCTGGAGCGAG GGGCCTACACACAGGTCATCTTCCTGGCCAGGAATAACACGATCGTGAAT GAGCTGGTACGTGTGACCAGTGAGGGAGCTGGCCTGCAGCTGCAGAAGGT GACTGTCCTGGGCGTGGCCACGGCGCCCCAGCAGGTCCTCTCCAACGGTG TCCCTGTCTCCAACTTCACCTACAGCCCCGACACCAAGGTCCTGGACATC TGTGTCTCGCTGTTGATGGGAGAGCAGTTTCTCGTCAGCTGGTGTTAGgc taga

Amino acid sequence of GILT1-87-GAA56-952: MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVSTPPTV LPDNFPRYPVGGAPHQQGASRPGPRDAQAHPGRPRAVPTQCDVPPNSRFD CAPDKAITQEQCEARGCCYIPAKQGLQGAQMGQPWCFFPPSYPSYKLENL SSSEMGYTATLTRTTPTFFPKDILTLRLDVMMETENRLHFTIKDPANRRY EVPLETPRVHSRAPSPLYSVEFSEEPFGVIVHRQLDGRVLLNTTVAPLFF ADQFLQLSTSLPSQYITGLAEHLSPLMLSTSWTRITLWNRDLAPTPGANL YGSHPFYLALEDGGSAHGVFLLNSNAMDVVLQPSPALSWRSTGGILDVYI FLGPEPKSVVQQYLDVVGYPFMPPYWGLGFHLCRWGYSSTAITRQVVENM TRAHFPLDVQWNDLDYMDSRRDFTFNKDGFRDFPAMVQELHQGGRRYMMI VDPAISSSGPAGSYRPYDEGLRRGVFITNETGQPLIGKVWPGSTAFPDFT NPTALAWWEDMVAEFHDQVPFDGMWIDMNEPSNFIRGSEDGCPNNELENP PYVPGVVGGTLQAATICASSHQFLSTHYNLHNLYGLTEAIASHRALVKAR GTRPFVISRSTFAGHGRYAGHWTGDVWSSWEQLASSVPEILQFNLLGVPL VGADVCGFLGNTSEELCVRWTQLGAFYPFMRNHNSLLSLPQEPYSFSEPA QQAMRKLTLRYALLPHLYTLFHQAHVAGETVARPLFLEFPKDSSTWTVDH QLLWGEALLITPVLQAGKAEVTGYFPLGTWYDLQTVPIEALGSLPPPPAA PREPAIHSEGQWVTLPAPLDTINVHLRAGYIIPLQGPGLTTTESRQQPMA LAVALTKGGEARGELFWDDGESLEVLERGAYTQVIFLARNNTIVNELVRV TSEGAGLQLQKVTVLGVATAPQQVLSNGVPVSNFTYSPDTKVLDICVSLL MGEQFLVSWC.

DNA sequence of GILT1-87-GAA70-952: ggtaccACACCAATGGGAATCCCAATGGGGAAGTCGATGCTGGTGCTTCT CACCTTCTTGGCCTTCGCCTCGTGCTGCATTGCTGCTTACCGCCCCAGTG AGACCCTGTGCGGCGGGGAGCTGGTGGACACCCTCCAGTTCGTCTGTGGG GACCGCGGCTTCTACTTCAGCAGGCCCGCAAGCCGTGTGAGCCGTCGCAG CCGTGGCATCGTTGAGGAGTGCTGTTTCCGCAGCTGTGACCTGGCCCTCC TGGAGACGTACTGTGCTACCCCCGCCAAGTCCGAGAGGGACGTGTCGACC CCTCCGACCGTGCTTCCGGACAACTTCCCCAGATACCCCGTGGGCggcgc gccgGCACACCCCGGCCGTCCCAGAGCAGTGCCCACACAGTGCGACGTCC CCCCCAACAGCCGCTTCGATTGCGCCCCTGACAAGGCCATCACCCAGGAA CAGTGCGAGGCCCGCGGCTGCTGCTACATCCCTGCAAAGCAGGGGCTGCA GGGAGCCCAGATGGGGCAGCCCTGGTGCTTCTTCCCACCCAGCTACCCCA GCTACAAGCTGGAGAACCTGAGCTCCTCTGAAATGGGCTACACGGCCACC CTGACCCGTACCACCCCCACCTTCTTCCCCAAGGACATCCTGACCCTGCG GCTGGACGTGATGATGGAGACTGAGAACCGCCTCCACTTCACGATCAAAG ATCCAGCTAACAGGCGCTACGAGGTGCCCTTGGAGACCCCGCGTGTCCAC AGCCGGGCACCGTCCCCACTCTACAGCGTGGAGTTCTCtGAGGAGCCCTT CGGGGTGATCGTGCACCGGCAGCTGGACGGCCGCGTGCTGCTGAACACGA CGGTGGCGCCCCTGTTCTTTGCGGACCAGTTCCTTCAGCTGTCCACCTCG CTGCCCTCGCAGTATATCACAGGCCTCGCCGAGCACCTCAGTCCCCTGAT GCTCAGCACCAGCTGGACCAGGATCACCCTGTGGAACCGGGACCTTGCGC CCACGCCCGGTGCGAACCTCTACGGGTCTCACCCTTTCTACCTGGCGCTG GAGGACGGCGGGTCGGCACACGGGGTGTTCCTGCTAAACAGCAATGCCAT GGATGTGGTCCTGCAGCCGAGCCCTGCCCTTAGCTGGAGGTCGACAGGTG GGATCCTGGATGTCTACATCTTCCTGGGCCCAGAGCCCAAGAGCGTGGTG CAGCAGTACCTGGACGTTGTGGGATACCCGTTCATGCCGCCATACTGGGG CCTGGGCTTCCACCTGTGCCGCTGGGGCTACTCCTCCACCGCTATCACCC GCCAGGTGGTGGAGAACATGACCAGGGCCCACTTCCCCCTGGACGTCCAA TGGAACGACCTGGACTACATGGACTCCCGGAGGGACTTCACGTTCAACAA GGATGGCTTCCGGGACTTCCCGGCCATGGTGCAGGAGCTGCACCAGGGCG GCCGGCGCTACATGATGATCGTGGATCCTGCCATCAGCAGCTCGGGCCCT GCCGGGAGCTACAGGCCCTACGACGAGGGTCTGCGGAGGGGGGTTTTCAT CACCAACGAGACCGGCCAGCCGCTGATTGGGAAGGTATGGCCCGGGTCCA CTGCCTTCCCCGACTTCACCAACCCCACAGCCCTGGCCTGGTGGGAGGAC ATGGTGGCTGAGTTCCATGACCAGGTGCCCTTCGACGGCATGTGGATTGA CATGAACGAGCCTTCCAACTTCATCAGGGGCTCTGAGGACGGCTGCCCCA ACAATGAGCTGGAGAACCCACCCTACGTGCCTGGGGTGGTTGGGGGGACC CTCCAGGCGGCAACCATCTGTGCCTCCAGCCACCAGTTTCTCTCCACACA CTACAACCTGCACAACCTCTACGGCCTGACCGAAGCCATCGCCTCCCACA 
GGGCGCTGGTGAAGGCTCGGGGGACACGCCCATTTGTGATCTCCCGCTCG ACCTTTGCTGGCCACGGCCGATACGCCGGCCACTGGACGGGGGACGTGTG GAGCTCCTGGGAGCAGCTCGCCTCCTCCGTGCCAGAAATCCTGCAGTTTA ACCTGCTGGGGGTGCCTCTGGTCGGGGCCGACGTCTGCGGCTTCCTGGGC AACACCTCAGAGGAGCTGTGTGTGCGCTGGACCCAGCTGGGGGCCTTCTA CCCCTTCATGCGGAACCACAACAGCCTGCTCAGTCTGCCCCAGGAGCCGT ACAGCTTCAGCGAGCCGGCCCAGCAGGCCATGAGGAAGGCCCTCACCCTG CGCTACGCACTCCTCCCCCACCTCTACACGCTGTTCCACCAGGCCCACGT CGCGGGGGAGACCGTGGCCCGGCCCCTCTTCCTGGAGTTCCCCAAGGACT CTAGCACCTGGACTGTGGACCACCAGCTCCTGTGGGGGGAGGCCCTGCTC ATCACCCCAGTGCTCCAGGCCGGGAAGGCCGAAGTGACTGGCTACTTCCC CTTGGGCACATGGTACGACCTGCAGACGGTGCCAATAGAGGCCCTTGGCA GCCTCCCACCCCCACCTGCAGCTCCCCGTGAGCCAGCCATCCACAGCGAG GGGCAGTGGGTGACGCTGCCGGCCCCCCTGGACACCATCAACGTCCACCT CCGGGCTGGGTACATCATCCCCCTGCAGGGCCCTGGCCTCACAACCACAG AGTCCCGCCAGCAGCCCATGGCCCTGGCTGTGGCCCTGACCAAGGGTGGA GAGGCCCGAGGGGAGCTGTTCTGGGACGATGGAGAGAGCCTGGAAGTGCT GGAGCGAGGGGCCTACACACAGGTCATCTTCCTGGCCAGGAATAACACGA TCGTGAATGAGCTGGTACGTGTGACCAGTGAGGGAGCTGGCCTGCAGCTG CAGAAGGTGACTGTCCTGGGCGTGGCCACGGCGCCCCAGCAGGTCCTCTC CAACGGTGTCCCTGTCTCCAACTTCACCTACAGCCCCGACACCAAGGTCC TGGACATCTGTGTCTCGCTGTTGATGGGAGAGCAGTTTCTCGTCAGCTGG TGTTAGtctaga

Amino acid sequence of GILT1-87-GAA70-952: MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVSTPPTV LPDNFPRYPVGGAPAHPGRPRAVPTQCDVPPNSRFDCAPDKAITQEQCEA RGCCYIPAKQGLQGAQMGQPWCFFPPSYPSYKLENLSSSEMGYTATLTRT TPTFFPKDILTLRLDVMMETENRLHFTIKDPANRRYEVPLETPRVHSRAP SPLYSVEFSEEPFGVIVHRQLDGRVLLNTTVAPLFFADQFLQLSTSLPSQ YITGLAEHLSPLMLSTSWTRITLWNRDLAPTPGANLYGSHPFYLALEDGG SAHGVFLLNSNAMDVVLQPSPALSWRSTGGILDVYIFLGPEPKSVVQQYL DVVGYPFMPPYWGLGFHLCRWGYSSTAITRQVVENMTRAHFPLDVQWNDL DYMDSRRDFTFNKDGFRDFPAMVQELHQGGRRYMMIVDPAISSSGPAGSY RPYDEGLRRGVFITNETGQPLIGKVWPGSTAFPDFTNPTALAWWEDMVAE FHDQVPFDGMWIDMNEPSNFIRGSEDGCPNNELENPPYVPGVVGGTLQAA TICASSHQFLSTHYNLHNLYGLTEAIASHRALVKARGTRPRVISRSTFAG HGRYAGHWTGDVWSSWEQLASSVPEILQFNLLGVPLVGADVCGFLGNTSE ELCVRWTQLGAFYPFMRNHNSLLSLPQEPYSFSEPAQQAMRKALTLRYAL LPHLYTLFHQAHVAGETVARPLFLEFPKDSSTWTVDHQLLWGEALLITPV LQAGKAEVTGYFPLGTWYDLQTVPIEALGSLPPPPAAPREPAIHSEGQWV TLPAPLDTINVHLRAGYIIPLQGPGLTTTESRQQPMALAVALTKGGEARG ELFWDDGESLEVLERGAYTQVIFLARNNTIVNELVRVTSEGAGLQLQKVT VLGVATAPQQVLSNGVPVSNFTYSPDTKVLDICVSLLMGEQFLVSWC.

The 5′ Asp718 site was cloned into the Asp718 site of pCEP4, and the 3′ XbaI site was blunted with Klenow and cloned into the HindIII site of pCEP4, resulting in pCEP-GILT1-87-GAA56-952 and pCEP-GILT1-87-GAA70-952, respectively. The constructs also contain a Gly-Ala-Pro linker sequence (AscI site). These constructs express proteins with GAA enzymatic activity.

In addition, the modification R68A was introduced to the GILT1-87 tag to remove a potential proteolysis site within the GILT tag (GILT1-87-R68A). The DNA (SEQ ID NO:_) and amino acid (SEQ ID NO:_) sequences of GILT1-87-R68A are shown below (the mutated sequences are underlined).

DNA sequence of GILT1-87-R68A (SEQ ID NO:_) GGTACCACACCAATGGGAATCCCAATGGGGAAGTCGATGCTGGTGCTTCT CACCTTCTTGGCCTTCGCCTCGTGCTGCATTGCTGCTTACCGCCCCAGTG AGACCCTGTGCGGCGGGGAGCTGGTGGACACCCTCCAGTTCGTCTGTGGG GACCGCGGCTTCTACTTCAGCAGGCCCGCAAGCCGTGTGAGCCGTCGCAG CCGTGGCATCGTTGAGGAGTGCTGTTTCCGCAGCTGTGACCTGGCCCTCC TGGAGACGTACTGTGCTACCCCCGCCAAGTCCGAGGCGGACGTGTCGACC CCTCCGACCGTGCTTCCGGACAACTTCCCCAGATACCCCGTGGGCGGCGC GCCG

Amino acid sequence of GILT1-87-R68A (SEQ ID NO:_) MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSEADVSTPPTV LPDNFPRYPVGGAP

Fusion of this tag to GAA positions 56 and 70 resulted in pCEP-GILT1-87-R68A-GAA56-952 and pCEP-GILT1-87-R68A-GAA70-952. These constructs expressed proteins with GAA enzymatic activity.

In addition, point mutations were introduced to substitute three Ser/Thr residues within the GILT1-87 tag to remove glycosylation sites (ΔGS) (GILT1-87-ΔGS). The DNA (SEQ ID NO:_) and amino acid (SEQ ID NO:_) sequences of GILT1-87-ΔGS are shown below (the mutated sequences are underlined).

DNA sequence of GILT1-87-ΔGS. GGTACCACACCAATGGGAATCCCAATGGGGAAGTCGATGCTGGTGCTTCT CACCTTCTTGGCCTTCGCCTCGTGCTGCATTGCTGCTTACCGCCCCAGTG AGACCCTGTGCGGCGGGGAGCTGGTGGACACCCTCCAGTTCGTCTGTGGG GACCGCGGCTTCTACTTCAGCAGGCCCGCAAGCCGTGTGAGCCGTCGCAG CCGTGGCATCGTTGAGGAGTGCTGTTTCCGCAGCTGTGACCTGGCCCTCC TGGAGACGTACTGTGCTACCCCCGCCAAGTCCGAGAGGGACGTGGCGGCC CCTCCGGCCGTGCTTCCGGACAACTTCCCCAGATACCCCGTGGGCGGCGC GCCG

Amino acid sequence of GILT1-87-ΔGS. MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSERDVAAPPAV LPDNFPRYPVGGAP

This modified GILT tag was fused to position 70 of GAA yielding pCEP-GILT1-87-ΔGS-GAA70-952. This construct expressed protein with GAA enzymatic activity.

In addition, a GILT tag incorporating both the R68A and ΔGS modifications was generated (GILT1-87-R68A-ΔGS). The DNA (SEQ ID NO:_) and amino acid (SEQ ID NO:_) sequences of GILT1-87-R68A-ΔGS are shown below (the mutated sequences are underlined).

DNA sequence of GILT1-87-R68A-ΔGS (SEQ ID NO:_). GGTACCACACCAATGGGAATCCCAATGGGGAAGTCGATGCTGGTGCTTCT CACCTTCTTGGCCTTCGCCTCGTGCTGCATTGCTGCTTACCGCCCCAGTG AGACCCTGTGCGGCGGGGAGCTGGTGGACACCCTCCAGTTCGTCTGTGGG GACCGCGGCTTCTACTTCAGCAGGCCCGCAAGCCGTGTGAGCCGTCGCAG CCGTGGCATCGTTGAGGAGTGCTGTTTCCGCAGCTGTGACCTGGCCCTCC TGGAGACGTACTGTGCTACCCCCGCCAAGTCCGAGGCGGACGTGGCGGCC CCTCCGGCCGTGCTTCCGGACAACTTCCCCAGATACCCCGTGGGCGGCGC GCCG

Amino acid sequence of GILT1-87-R68A-ΔGS (SEQ ID NO:_). MGIPMGKSMLVLLTFLAFASCCIAAYRPSETLCGGELVDTLQFVCGDRGF YFSRPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKSEADVAAPPAV LPDNFPRYPVGGAP

The modified GILT1-87-R68A-ΔGS was used to generate constructs pCEP-GILT1-87-R68AΔGS-GAA56-952 and pCEP-GILT1-87-R68AΔGS-GAA70-952. Both constructs expressed protein with GAA enzymatic activity.
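The relationship between the tag DNA variants and their amino acid substitutions can be checked with a short translation sketch. The example below is illustrative only: it translates a 36-nucleotide in-frame window copied from each tag DNA sequence above using the standard genetic code and lists the residues that differ from the unmodified GILT1-87 window. The function names and the choice of window are assumptions made for this example, and the reported positions are indices within the window, not IGF-II residue numbers.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string (spaces and case are ignored)."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitutions(reference, variant):
    """List (window position, reference residue, variant residue) per difference."""
    return [
        (pos, ref, var)
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


# In-frame windows copied from the tag DNA sequences given above
# (hypothetical choice of window, made only for this illustration).
WINDOWS = {
    "GILT1-87": "AAGTCCGAGAGGGACGTGTCGACCCCTCCGACCGTG",
    "GILT1-87-R68A": "AAGTCCGAGGCGGACGTGTCGACCCCTCCGACCGTG",
    "GILT1-87-dGS": "AAGTCCGAGAGGGACGTGGCGGCCCCTCCGGCCGTG",
    "GILT1-87-R68A-dGS": "AAGTCCGAGGCGGACGTGGCGGCCCCTCCGGCCGTG",
}

wild_type = translate(WINDOWS["GILT1-87"])
for name, dna in WINDOWS.items():
    protein = translate(dna)
    print(name, protein, substitutions(wild_type, protein))
```

In this window the R68A change appears as a single Arg-to-Ala substitution, and the ΔGS change appears as three Ser/Thr-to-Ala substitutions, in agreement with the amino acid sequences listed above.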

Western blots were performed on the above proteins with the GILT tag fused at GAA position 56. As illustrated in FIG. 6, the precursor proteins are full length and contain the IGF-II tag. The ΔGS mutation appears to produce a protein with a slightly faster mobility, consistent with the absence of a carbohydrate moiety.

Example 6 Additional Constructs With Longer and Modified GILT Tags

In an effort to provide a native folding environment for an IGF-II tag, a precursor form of IGF-II including amino acids 8-156 was used as an internal tag fused at GAA position 791. In addition, mutations E67A and D69S were made in the IGF-II sequence to introduce a P2/P1 proteolysis processing site in order to promote cleavage downstream of position 87 within the IGF-II tag. The resulting construct, pCEP-GAA-791IGF2-P2/P1, yields a protein with GAA enzymatic activity. The DNA and amino acid sequences of pCEP-GAA-791IGF2-P2/P1 are shown below.

DNA sequence of pCEP-GAA-791IGF2-P2/P1 ATGGGAGTGAGGCACCCGCCCTGCTCCCACCGGCTCCTGGCCGTCTGCGC CCTCGTGTCCTTGGCAACCGCTGCACTCCTGGGGCACATCCTACTCCATG ATTTCCTGCTGGTTCCCCGAGAGCTGAGTGGCTCCTCCCCAGTCCTGGAG GAGACTCACCCAGCTCACCAGCAGGGAGCCAGCAGACCAGGGCCCCGGGA TGCCCAGGCACACCCCGGCCGTCCCAGAGCAGTGCCCACACAGTGCGACG TCCCCCCCAACAGCCGCTTCGATTGCGCCCCTGACAAGGCCATCACCCAG GAACAGTGCGAGGCCCGCGGCTGCTGCTACATCCCTGCAAAGCAGGGGCT GCAGGGAGCCCAGATGGGGCAGCCCTGGTGCTTCTTCCCACCCAGCTACC CCAGCTACAAGCTGGAGAACCTGAGCTCCTCTGAAATGGGCTACACGGCC ACCCTGACCCGTACCACCCCCACCTTCTTCCCCAAGGACATCCTGACCCT GCGGCTGGACGTGATGATGGAGACTGAGAACCGCCTCCACTTCACGATCA AAGATCCAGCTAACAGGCGCTACGAGGTGCCCTTGGAGACCCCGCGTGTC CACAGCCGGGCACCGTCCCCACTCTACAGCGTGGAGTTCTCtGAGGAGCC CTTCGGGGTGATCGTGCACCGGCAGCTGGACGGCCGCGTGCTGCTGAACA CGACGGTGGCGCCCCTGTTCTTTGCGGACCAGTTCCTTCAGCTGTCCACC TCGCTGCCCTCGCAGTATATCACAGGCCTCGCCGAGCACCTCAGTCCCCT GATGCTCAGCACCAGCTGGACCAGGATCACCCTGTGGAACCGGGACCTTG CGCCCACGCCCGGTGCGAACCTCTACGGGTCTCACCCTTTCTACCTGGCG CTGGAGGACGGCGGGTCGGCACACGGGGTGTTCCTGCTAAACAGCAATGC CATGGATGTGGTCCTGCAGCCGAGCCCTGCCCTTAGCTGGAGGTCGACAG GTGGGATCCTGGATGTCTACATCTTCCTGGGCCCAGAGCCCAAGAGCGTG GTGCAGCAGTACCTGGACGTTGTGGGATACCCGTTCATGCCGCCATACTG GGGCCTGGGCTTCCACCTGTGCCGCTGGGGCTACTCCTCCACCGCTATCA CCCGCCAGGTGGTGGAGAACATGACCAGGGCCCACTTCCCCCTGGACGTC CAATGGAACGACCTGGACTACATGGACTCCCGGAGGGACTTCACGTTCAA CAAGGATGGCTTCCGGGACTTCCCGGCCATGGTGCAGGAGCTGCACCAGG GCGGCCGGCGCTACATGATGATCGTGGATCCTGCCATCAGCAGCTCGGGC CCTGCCGGGAGCTACAGGCCCTACGACGAGGGTCTGCGGAGGGGGGTTTT CATCACCAACGAGACCGGCCAGCCGCTGATTGGGAAGGTATGGCCCGGGT CCACTGCCTTCCCCGACTTCACCAACCCCACAGCCCTGGCCTGGTGGGAG GACATGGTGGCTGAGTTCCATGACCAGGTGCCCTTCGACGGCATGTGGAT TGACATGAACGAGCCTTCCAACTTCATCAGGGGCTCTGAGGACGGCTGCC CCAACAATGAGCTGGAGAACCCACCCTACGTGCCTGGGGTGGTTGGGGGG ACCCTCCAGGCGGCAACCATCTGTGCCTCCAGCCACCAGTTTCTCTCCAC ACACTACAACCTGCACAACCTCTACGGCCTGACCGAAGCCATCGCCTCCC ACAGGGCGCTGGTGAAGGCTCGGGGGACACGCCCATTTGTGATCTCCCGC TCGACCTTTGCTGGCCACGGCCGATACGCCGGCCACTGGACGGGGGACGT GTGGAGCTCCTGGGAGCAGCTCGCCTCCTCCGTGCCAGAAATCCTGCAGT 
TTAACCTGCTGGGGGTGCCTCTGGTCGGGGCCGACGTCTGCGGCTTCCTG GGCAACACCTCAGAGGAGCTGTGTGTGCGCTGGACCCAGCTGGGGGCCTT CTACCCCTTCATGCGGAACCACAACAGCCTGCTCAGTCTGCCCCAGGAGC CGTACAGCTTCAGCGAGCCGGCCCAGCAGGCCATGAGGAAGGCCCTCACC CTGCGCTACGCACTCCTCCCCCACCTCTACACGCTGTTCCACCAGGCCCA CGTCGCGGGGGAGACCGTGGCCCGGCCCCTCTTCCTGGAGTTCCCCAAGG ACTCTAGCACCTGGACTGTGGACCACCAGCTCCTGTGGGGGGAGGCCCTG CTCATCACCCCAGTGCTCCAGGCCGGGAAGGCCGAAGTGACTGGCTACTT CCCCTTGGGCACATGGTACGACCTGCAGACGGTGCCAATAGAGGCCCTTG GCAGCCTCCCACCCCCACCTggcgcgccgCTGTGCGGCGGGGAGCTGGTG GACACCCTCCAGTTCGTCTGTGGGGACCGCGGCTTCTACTTCAGCAGGCC CGCAAGCCGTGTGAGCCGTCGCAGCCGTGGCATCGTTGAGGAGTGCTGTT TCCGCAGCTGTGACCTGGCCCTCCTGGAGACGTACTGTGCTACCCCCGCC AAGTCCGcGAGGtcCGTGTCGACCCCTCCGACCGTGCTTCCGGACAACTT CCCCAGATACCCCGTGGGCAAGTTCTTCCAATATGACACCTGGAAGCAGT CCACCCAGCGCCTGCGCAGGGGCCTGCCTGCCCTCCTGCGTGCCCGCCGG GGTCACGTGCTCGCCAAGGAGCTCGAGGCGTTCAGGGAGGCCAAACGTCA CCGTCCCCTGATTGCTCTACCCACCCAAGACCCCGCCCACGGGGGCGCCC CCCAGAGATGGCCAGCAATCGGAAGggcgcgccgGCAGCTCCCCGTGAGC CAGCCATCCACAGCGAGGGGCAGTGGGTGACGCTGCCGGCCCCCCTGGAC ACCATCAACGTCCACCTCCGGGCTGGGTACATCATCCCCCTGCAGGGCCC TGGCCTCACAACCACAGAGTCCCGCCAGCAGCCCATGGCCCTGGCTGTGG CCCTGACCAAGGGTGGAGAGGCCCGAGGGGAGCTGTTCTGGGACGATGGA GAGAGCCTGGAAGTGCTGGAGCGAGGGGCCTACACACAGGTCATCTTCCT GGCCAGGAATAACACGATCGTGAATGAGCTGGTACGTGTGACCAGTGAGG GAGCTGGCCTGCAGCTGCAGAAGGTGACTGTCCTGGGCGTGGCCACGGCG CCCCAGCAGGTCCTCTCCAACGGTGTCCCTGTCTCCAACTTCACCTACAG CCCCGACACCAAGGTCCTGGACATCTGTGTCTCGCTGTTGATGGGAGAGC AGTTTCTCGTCAGCTGGTGTTAG

Amino acid sequence of pCEP-GAA-791IGF2-P2/P1 MGVRHPPCSHRLLAVCALVSLATAALLGHILLHDFLLVPRELSGSSPVLE ETHPAHQQGASRPGPRDAQAHPGRPRAVPTQCDVPPNSRFDCAPDKAITQ EQCEARGCCYIPAKQGLQGAQMGQPWCFFPPSYPSYKLENLSSSEMGYTA TLTRTTPTFFPKDILTLRLDVMMETENRLHFTIKDPANRRYEVPLETPRV HSRAPSPLYSVEFSEEPFGVIVHRQLDGRVLLNTTVAPLFFADQFLQLST SLPSQYITGLAEHLSPLMLSTSWTRITLWNRDLAPTPGANLYGSHPFYLA LEDGGSAHGVFLLNSNAMDVVLQPSPALSWRSTGGILDVYIFLGPEPKSV VQQYLDVVGYPFMPPYWGLGFHLCRWGYSSTAITRQVVENMTRAHFPLDV QWNDLDYMDSRRDFTFNKDGFRDFPAMVQELHQGRRYMMIVDPAISSSGP AGSYRPYDEGLRRGVGITNETGQPLIGKVWPGSTAFPDFTNPTALAWWED MVAEFHDQVPFDGMWIDMNEPSNFIRGSEDGCPNNELENPPYVPGVVGGT LQAATICASSHQFLSTHYNLHNLYGLTEAIASHRALVKARGTRPFVISRS TFAGHGRYAGHWTGDVWSSWEQLASSVPEILQFNLLGVPLVGADVCGFLG NTSEELCVRWTQLGAFYPFMRNHNSLLSLPQEPYSFSEPAQQAMRKALTL RYALLPHLYTLFHQAHVAGETVARPLFLEFPKDSSTWTVDHQLLWGEALL ITPVLQAGKAEVTGYFPLGTWYDLQTVPIEALGSLPPPPGAPLCGGELVD TLQFVCGDRGFYFSPASRVSRRSRGIVEECCFRSCDLALLETYCATPAKS ARSVSTPPTVLPDNFPRYPVGKFFQYDTWKQSTQRLRRGLPALLRARRGH VLAKELEAFREAKRHRPLIALPTQDPAHGGAPPEMASNRKGAPAAPREPA IHSEGQWVTLPAPLDTINVHLRAGYIIPLQGPGLTTTESRQQPMALAVAL TKGGEARGELFWDDGESLEVLERGAYTQVIFLARNNTIVNELVRVTSEGA GLQLQKVTVLGVATAPQQVLSNGVPVSNFTYSPDTKVLDICVSLLMGEQF LVSWC.

To further improve presentation and/or folding of the GILT tag fused at the N-terminus (e.g., position 70), a spacer with the sequence Gly-Gly-Gly-Gly-Gly-Pro was inserted between an N-terminal GILTΔ2-7 tag and the GAA fusion point position 70, yielding pCEP-GILTΔ2-7-spcr1-GAA70-952. This construct expressed protein with GAA enzymatic activity.

Example 7 GAA Construct With AscI Restriction Site Insertion

Constructs were made to include an insertion of the Gly-Ala-Pro sequence (an AscI restriction site) within the GAA region spanning amino acid residues 783-791 according to standard molecular techniques. As indicated in Table 7, insertion of the AscI restriction site at positions 787, 791, and 796 increases transient GAA enzyme expression levels relative to wild-type GAA. This insertion could possibly cause a shift of the enzyme to a high-affinity form. Normally the precursor GAA matures into the high-affinity GAA form after cleavage in the 783-791 boundary region (Moreland et al., 2004). It was reported that, after cleavage, the N-terminal region and the C-terminal region remain associated (Moreland et al., 2004).

TABLE 7
AscI restriction site insertion increases transient expression.

Enzyme        Transient     Average transient expression
              expression    from two experiments (U/ml)
GAA           ++            20
GAA-779Asc    ++            11
GAA-787Asc    +++           199
GAA-791Asc    +++           243
GAA-796Asc    ++-+++        88
GAA-881Asc    +             5
GAA-920Asc    +             5
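As a reading aid for Table 7, the following minimal sketch expresses each transient expression value as a fold change over wild-type GAA. The U/ml values are copied from Table 7; the function and variable names are hypothetical and introduced only for this illustration.

```python
# Average transient expression (U/ml) from two experiments, as listed in Table 7.
TRANSIENT_U_PER_ML = {
    "GAA": 20,
    "GAA-779Asc": 11,
    "GAA-787Asc": 199,
    "GAA-791Asc": 243,
    "GAA-796Asc": 88,
    "GAA-881Asc": 5,
    "GAA-920Asc": 5,
}


def fold_over_wild_type(values, reference="GAA"):
    """Divide every value by the wild-type (reference) value."""
    baseline = values[reference]
    return {name: value / baseline for name, value in values.items()}


fold = fold_over_wild_type(TRANSIENT_U_PER_ML)

# Print constructs from highest to lowest fold change.
for name, ratio in sorted(fold.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<12}{ratio:6.2f}x")
```

On these figures the 787 and 791 insertions come out at roughly ten-fold and twelve-fold over wild-type GAA, while the 779, 881, and 920 insertions fall below the wild-type level.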

In order to determine if insertion of the three residues promotes cleavage of the precursor GAA, Western blot analysis was performed comparing wild-type GAA and GAA-791Asc proteins using anti-GAA polyclonal antibody. As indicated in FIG. 7, GAA-791Asc migrated with a similar mobility to that of wild-type GAA, indicating that the insertion of the three residues does not promote proteolysis.

An alternate explanation for the increase in enzyme activity is that insertion of the residues within the domain boundary allows a conformational shift to the high-affinity form without cleavage of the two domains. This can be tested using affinity chromatography and comparing the binding affinity of GAA-791Asc and wild-type GAA on a Superdex 200 column as described in Moreland et al., 2004.

In addition, construct pCEP-GILTΔ2-7-GAA70-952-791Asc was made to combine the 791 AscI site insertion with an N-terminal GILT tag at position 70.

Example 8 Internal GILT Tags With Engineered Proteolysis Sites

In order to generate an active internal GILT tag, experiments were designed to place the tag within the GAA 779-796 region engineered with a Factor X restriction protease site downstream of the tag. Treatment of the expressed protein with Factor Xa would release the C-terminal portion of GAA and potentially reveal an exposed and active GILT tag.

Accordingly, GILT tags with a downstream Factor X protease site were placed within GAA at positions 787, 779, and 796. All three resulting proteins had GAA enzymatic activity. In Western analysis, as illustrated in FIG. 8, all three proteins contained the GILT tag as probed by an anti-IGF-II antibody. All three protein preparations contained a band with a relative mobility (M_r 120,000-140,000) consistent with the presence of a full-length precursor. All three protein preparations also contained a faster migrating intermediate band (M_r 85,000-100,000) which retained the GILT tag. Upon treatment of the proteins with Factor Xa, most of the full-length band is removed, and the intermediate band is shifted slightly lower. The GILT tag is retained in the Xa-treated intermediate bands.

It is possible that the presence of the GILTXa tags within the GAA sequence promotes proteolysis at a site downstream of the Factor X site. Position 816 has been reported as a site of GAA processing upon maturation.

A possible GAA C-terminal processing model is illustrated in FIG. 8.

Example 9 Human/Mouse GAA Hybrids

In order to improve the folding of the GILT tag, chimeric proteins composed of N-terminal human GAA and C-terminal mouse GAA were constructed with fusion points at amino acid positions 791, 796, 816, 881, and 920 of human GAA. An AscI restriction site encoding the sequence Gly-Ala-Pro was introduced at the point of fusion.

Specifically, chimeric human/mouse GAA proteins were made with C-terminal portions of human GAA replaced with corresponding mouse GAA C-terminal sequence. DNA cassettes were constructed by fusing the human and mouse portions at a common linker sequence, ggcgcgccg, that contains a unique AscI site and encodes the sequence Gly-Ala-Pro (GAP). Mouse portions of the GAA hybrid were generated by PCR with the primers listed below that contain the 5′ AscI site for fusion to the N-terminal human GAA sequence and a 3′ NotI site for cloning into the NotI site of the pCEP vector.

TABLE 8
Human/Mouse GAA hybrids.

Human GAA   Linker     Mouse GAA portions   Mouse GAA portions   Forward primer                      Reverse primer
portions    sequence   (human numbering)    (mouse numbering)    (with 5′ AscI site)                 (with 3′ NotI site)
1-790       GAP        791-952              792-953              gcggcgcgccgGCTTCATCCTTCAGATCTGC     ggcggccgcCTAGGACCAGCTGATTTGAAAC
1-796       GAP        797-952              798-953              gcggcgcgccgGCTGTCCAGAGCAAGGGGC      ggcggccgcCTAGGACCAGCTGATTTGAAAC
1-816       GAP        817-952              818-953              gcggcgcgccgCACCTGAGGGAGGGGTACATC    ggcggccgcCTAGGACCAGCTGATTTGAAAC
1-881       GAP        882-952              883-953              gcggcgcgccgAACAATACCATTGTGAACAAG    ggcggccgcCTAGGACCAGCTGATTTGAAAC
1-920       GAP        921-952              922-953              gcggcgcgccgATCCCTGTCTCCAATTTCACC    ggcggccgcCTAGGACCAGCTGATTTGAAAC
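The primer design described above can be sanity-checked with a few string operations. The sketch below is illustrative only. It confirms that the linker ggcgcgccg contains the AscI recognition sequence (GGCGCGCC) and reads as Gly-Ala-Pro, that each forward primer carries the AscI site, and that the reverse primer carries the NotI recognition sequence (GCGGCCGC). It also reverse-complements the template-matching part of the reverse primer for comparison with the 3′ end of the mouse GAA coding sequence. Treating the upper-case portion of each primer as the template-matching region is an assumption based on the typography of the table, and all function names are hypothetical.

```python
ASCI_SITE = "GGCGCGCC"  # AscI recognition sequence
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence
LINKER = "ggcgcgccg"

# Only the three codons needed for the linker (standard genetic code).
LINKER_CODONS = {"GGC": "Gly", "GCG": "Ala", "CCG": "Pro"}

# Primers as listed in Table 8, keyed by the human GAA portion they join.
FORWARD_PRIMERS = {
    "1-790": "gcggcgcgccgGCTTCATCCTTCAGATCTGC",
    "1-796": "gcggcgcgccgGCTGTCCAGAGCAAGGGGC",
    "1-816": "gcggcgcgccgCACCTGAGGGAGGGGTACATC",
    "1-881": "gcggcgcgccgAACAATACCATTGTGAACAAG",
    "1-920": "gcggcgcgccgATCCCTGTCTCCAATTTCACC",
}
REVERSE_PRIMER = "ggcggccgcCTAGGACCAGCTGATTTGAAAC"

# Last 22 nucleotides of the mouse GAA nucleotide sequence given in this example.
MOUSE_GAA_3PRIME_END = "GTTTCAAATCAGCTGGTCCTAG"


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (case is ignored)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def linker_residues(linker):
    """Read the linker codon by codon using the three-codon lookup above."""
    linker = linker.upper()
    return [LINKER_CODONS[linker[i:i + 3]] for i in range(0, len(linker), 3)]


def annealing_region(primer):
    """Assumption: upper-case letters mark the template-matching region."""
    return "".join(ch for ch in primer if ch.isupper())


print("linker encodes:", "-".join(linker_residues(LINKER)))
print("AscI site in linker:", ASCI_SITE in LINKER.upper())
for portion, primer in FORWARD_PRIMERS.items():
    print(portion, "forward primer has AscI site:", ASCI_SITE in primer.upper())
print("reverse primer has NotI site:", NOTI_SITE in REVERSE_PRIMER.upper())
print(
    "reverse primer matches mouse GAA 3' end:",
    reverse_complement(annealing_region(REVERSE_PRIMER)) == MOUSE_GAA_3PRIME_END,
)
```

A check of this kind only confirms that the listed sequences are internally consistent; it says nothing about primer performance in PCR.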

Mouse GAA nucleotide sequence: ATGAATATACGGAAGCCCCTCTGTTCGAACTCCGTGGTTGGGGCCTGCAC CCTTATCTCTCTGACTACAGCGGTCATCCTGGGTCATCTCATGCTTCGGG AGTTAATGCTGCTTCCCCAAGACCTTCATGAGTCCTCTTCAGGACTGTGG AAGACGTACCGACCTCACCACCAGGAAGGTTACAAGCCAGGGCCTCTGCA CATCCAGGAGCAGACTGAACAGCCCAAAGAAGCACCCACACAGTGTGATG TGCCCCCCAGCAGCCGCTTTGACTGTGCCCCCGACAAAGGCATCTCACAG GAGCAATGCGAGGCCCGCGGCTGCTGCTATGTCCCAGCAGGGCAGGTGCT GAAGGAGCCGCAGATAGGGCAGCCCTGGTGTTTCTTCCCTCCCAGCTACC CAAGCTACCGTCTAGAGAACCTGAGCTCTACAGAGTCGGGGTACACAGCC ACCCTGACCCGTACCAGCCCGACCTTCTTCCCAAAGGATGTGCTGACCTT ACAGCTGGAGGTGCTGATGGAGACAGACAGCCGCCTCCACTTCAAGATCA AAGATCCTGCTAGTAAGCGCTACGAAGTGCCCCTGGAGACCCCACGTGTG CTGAGCCAGGCACCATCCCCACTTTACAGCGTGGAATTCTCAGAGGAACC CTTTGGAGTGATCGTTCGTAGGAAGCTTGGTGGCCGAGTGTTGCTGAACA CAACCGTGGCCCCCCTGTTCTTCGCTGACCAGTTCCTGCAGCTGTCCACT TCCCTGCCCTCCCAGCACATCACAGGCCTGGGGGAACACCTCAGCCCACT CATGCTCAGCACCGACTGGGCTCGTATCACCCTCTGGAACCGGGACACAC CACCCTCGCAAGGTACCAACCTCTACGGGTCACATCCTTTCTACCTGGCA CTGGAGGACGGTGGCTTGGCTCACGGTGTCTTCTTGCTAAACAGCAATGC CATGGATGTCATCCTGCAACCCAGCCCAGCCCTAACCTGGAGGTCAACGG GCGGGATCCTGGATGTGTATGTGTTCCTAGGCCCAGAGCCCAAGAGCGTT GTGCAACAATACCTGGATGTTGTGGGATACCCCTTCATGCCTCCATACTG GGGCCTCGGCTTCCACCTCTGCCGCTGGGGCTACTCCTCGACCGCCATTG TCCGCCAGGTAGTGGAGAACATGACCAGGACACACTTCCCGCTGGATGTG CAATGGAATGACCTGGACTACATGGACGCCCGAAGAGACTTCACCTTCAA CCAGGACAGCTTTGCCGACTTCCCAGACATGGTGCGGGAGCTGCACCAGG GTGGCCGGCGCTACATGATGATCGTGGACCCTGCCATCAGCAGCGCAGGC CCTGCTGGGAGTTACAGGCCCTACGACGAGGGTCTGCGGAGGGGTGTGTT CATCACCAACGAGACTGGGCAGCCGCTGATTGGGAAGGTTTGGCCCGGAA CCACCGCCTTCCCTGATTTCACCAACCCTGAGACCCTTGACTGGTGGCAG GACATGGTGTCTGAGTTCCACGCCCAGGTGCCCTTCGATGGCATGTGGCT CGACATGAACGAACCGTCCAACTTCGTTAGAGGCTCTCAGCAGGGCTGCC CCAACAATGAACTGGAGAACCCCCCCTATGTGCCCGGGGTGGTTGGCGGG ATCTTGCAGGCAGCCACCATCTGTGCCTCCAGCCACCAATTCCTCTCCAC ACACTACAACCTCCACAACCTGTACGGCCTCACTGAAGCTATCGCCTCCA GCAGGGCCCTGGTCAAGACTCGGGGAACACGACCCTTTGTGATCTCCCGC TCAACCTTCTCGGGCCACGGCCGGTACGCTGGTCACTGGACAGGGGATGT GCGGAGCTCTTGGGAGCATCTTGCATACTCTGTGCCAGACATCCTGCAGT 
TCAACCTGCTGGGCGTGCCCCTGGTCGGGGCGGACATCTGCGGCTTCATA GGAGACACGTCAGAAGAGCTGTGTGTGCGCTGGACCCAGTTGGGGGCCTT CTACCCCTTCATGCGGAACCACAATGACCTGAATAGCGTGCCTCAGGAGC CGTACAGGTTCAGCGAGACGGCGCAGCAGGCCATGAGGAAGGCCTTCGCC TTACGCTATGCCCTTCTGCCCTACCTGTACACTCTCTTCCACCGCGCCCA CGTCAGAGGAGACACGGTGGCCCGGCCCCTCTTCCTGGAGTTCCCTGAGG ATCCCAGCACCTGGTCTGTGGACCGCCAGCTCTTGTGGGGGCCGGCCCTG CTCATCACACCTGTGCTTGAGCCTGGGAAAACTGAAGTGACGGGCTACTT CCCCAAGGGCACGTGGTACAACATGCAGGTGGTGTCAGTGGATTCCCTCG GTACTCTCCCTTCTCCATCATCGGCTTCATCCTTCAGATCTGCTGTCCAG AGCAAGGGGCAGTGGCTGACACTGGAAGCCCCACTGGATACCATCAACGT GCACCTGAGGGAGGGGTACATCATACCGCTGCAGGGTCCCAGCCTCACAA CCACGGAGTCCCGAAAGCAGCCCATGGCTCTGGCTGTGGCATTAACAGCA AGCGGCGAGGCCGATGGGGAGCTGTTCTGGGACGACGGGGAGAGCCTTGC AGTTCTGGAGCGTGGGGCCTACACACTGGTCACCTTCTCAGCCAAGAACA ATACCATTGTGAACAAGTTAGTGCGTGTGACCAAGGAGGGAGCTGAGCTA CAACTGAGGGAGGTGACCGTCTTGGGAGTGGCCACAGCTCCTACCCAGGT CCTTTCCAACGGCATCCCTGTCTCCAATTTCACCTACAGCCCTGACAACA AGAGCCTGGCCATCCCTGTCTCACTGCTGATGGGAGAGCTGTTTCAAATC AGCTGGTCCTAG

Mouse GAA amino acid sequence: MNIRKPLCSNSVVGACTLISLTTAVILGHLMLRELMLLPQDLHESSSGLW KTYRPHHQEGYKPGPLHIQEQTEQPKEAPTQCDVPPSSRFDCAPDKGISQ EQCEARGCCYVPAGQVLKEPQIGQPWCFFPPSYPSYRLENLSSTESGYTA TLTRTSPTFFPKDVLTLQLEVLMETDSRLHFKIKDPASKRYEVPLETPRV LSQAPSPLYSVEFSEEPFGVIVRRKLGGRVLLNTTVAPLFFADQFLQLST SLPSQHITGLGEHLSPLMLSTDWARITLWNRDTPPSQGTNLYGSHPFYLA LEDGGLAHGVFLLNSNAMDVILQPSPALTWRSTGGILDVYVFLGPEPKSV VQQYLDVVGYPFMPPYWGLGFHLCRWGYSSTAIVRQVVENMTRTHFPLDV QWNDLDYMDARRDFTFNQDSFADFPDMVRELHQGGRRYMMIVDPAISSAG PAGSYRPYDEGLRRGVFITNETGQPLIGKVWPGTTAFPDFTNPETLDWWQ DMVSEFHAQVPFDGMWLDMNEPSNFVRGSQQGCPNNELENPPYVPGVVGG ILQAATICASSHQFLSTHYNLHNLYGLTEAIASSRALVKTRGTRPFVISR STFSGHGRYAGHWTGDVRSSWEHLAYSVPDILQFNLLGVPLVGADICGFI GDTSEELCVRWTQLGAFYPFMRNHNDLNSVPQEPYRFSETAQQAMRKAFA LRYALLPYLYTLFHRAHVRGDTVARPLFLEFPEDPSTWSVDRQLLWGPAL LITPVLEPGKTEVTGYFPKGTWYNMQVVSVDSLGTLPSPSSASSFRSAVQ SKGQWLTLEAPLDTINVHLREGYIIPLQGPSLTTTESRKQPMALAVALTA SGEADGELFWDDGESLAVLERGAYTLVTFSAKNNTIVNKLVRVTKEGAEL QLREVTVLGVATAPTQVLSNGIPVSNFTYSPDNKSLAIPVSLLMGELFQI SWS.

The chimeric GAA cassettes were transfected into HEK293 cells as described in Example 1. GAA expression levels were determined from two stable transfectants. As shown in Table 9, fusion at position 881 gives the highest enzyme expression levels. Western analysis of the position 881 fusion hybrid shows that the expressed precursor protein is of similar size to wild-type GAA.

TABLE 9
Human/Mouse GAA Hybrids Expression.

Fusion position    Stable GAA expression (nmol/hr-ml),
                   average of two lines
791                31
796                20
816                11
881                83
920                5

Further experiments were carried out to determine if the presence of the mouse GAA sequence at the C-terminus of the hybrids was able to accommodate the presence of the GILT tag. Accordingly, a GILT Δ1-7 tag was fused to the C-terminus of each of the five full-length human/mouse hybrids listed above and the expression levels were determined in each case. Constructs were also made to combine the C-terminal position 881 mouse GAA hybrid with an N-terminal GILT tag at positions 29, 56, 70, or 81. The expression levels were determined as described above.

1. A polypeptide comprising an amino acid sequence at least 50% identical to amino acid residues 70-790 of human acid alpha-glucosidase (GAA) or a fragment thereof and a targeting domain that binds an extracellular domain of a receptor on the surface of a target cell, wherein the polypeptide is not associated with a sequence at least 50% identical to amino acid residues 880-952 of human GAA.
 2. (canceled)
 3. The polypeptide of claim 1, wherein the targeting domain permits localization of the polypeptide in a human lysosome.
 4-16. (canceled)
 17. A targeted therapeutic comprising a therapeutic agent comprising at least a portion of human acid alpha-glucosidase (GAA) and a peptide tag, wherein the peptide tag is directly or indirectly linked to an amino acid residue selected from the group consisting of amino acid residues 68, 69, 70, 71, 72, 789, 790, 791, 792, and 793 of human GAA.
 18. The targeted therapeutic of claim 17, wherein the amino acid residue is selected from the group consisting of amino acid residues 68-72 of human GAA.
 19. The targeted therapeutic of claim 17, wherein the therapeutic agent comprises amino acid residues 70-952 of human GAA.
 20. The targeted therapeutic of claim 19, wherein the peptide tag is linked to amino acid residue 70 of human GAA.
 21. The targeted therapeutic of claim 17, wherein the peptide tag is a ligand for an extracellular receptor.
 22. The targeted therapeutic of claim 21, wherein the peptide tag is a targeting domain that binds an extracellular domain of a receptor on the surface of a target cell and, upon internalization of the receptor, permits localization of the therapeutic agent to a human lysosome.
 23. The targeted therapeutic of claim 22, wherein the targeting domain comprises a urokinase-type plasminogen receptor moiety.
 24. The targeted therapeutic of claim 22, wherein the peptide tag comprises amino acid residues 48-55 of human IGF-II.
 25. The targeted therapeutic of claim 22, wherein the peptide tag comprises amino acid residues 8-28 and 41-61 of human IGF-II, or a sequence variant thereof that binds the human cation-independent mannose-6-phosphate receptor.
 26. The targeted therapeutic of claim 22, wherein the peptide tag comprises amino acid residues 8-87 of human IGF-II or a sequence variant thereof, wherein the sequence variant binds the human cation-independent mannose-6-phosphate receptor.
 27. The targeted therapeutic of claim 26, wherein the peptide tag comprises an R68A mutation of human IGF-II.
 28. The targeted therapeutic of claim 24, wherein the targeting domain comprises a C-terminally truncated form of human IGF-II or a sequence variant thereof, wherein the truncation begins at position 62.
 29. The targeted therapeutic of claim 17, further comprising a spacer between the therapeutic agent and the peptide tag.
 30. The targeted therapeutic of claim 29, wherein the spacer comprises glycine residues.
 31. The polypeptide of claim 29, wherein the spacer comprises an α-helical structure.
 32. The polypeptide of claim 29, wherein the spacer comprises an amino acid sequence at least 50% identical to Gly-Gly-Gly-Gly-Thr-Val-Gly-Asp-Asp-Asp-Asp-Lys (SEQ ID NO:1).
 33. A nucleic acid sequence comprising an open reading frame of a polypeptide comprising an amino acid sequence at least 50% identical to amino acid residues 70-790 of human acid alpha-glucosidase (GAA) or a fragment thereof, wherein the open reading frame does not include an amino acid sequence at least 50% identical to amino acid residues 880-952 of human GAA.
 34. (canceled)
 35. (canceled)
 36. The nucleic acid sequence of claim 33, wherein the polypeptide further comprises a targeting domain that binds an extracellular domain of a receptor on the surface of a target cell and, upon internalization of the receptor, permits localization of the polypeptide to a human lysosome.
 37-46. (canceled)